DOCUMENT RESUME

ED 146 195                                            TM 006 569

AUTHOR        Mintzes, Joel J.
TITLE         Field Test and Validation of a Teaching Evaluation
              Instrument: The Student Opinion Survey of Teaching. A
              Report Submitted to the Senate Committee for Teaching
              and Learning, Faculty Senate, University of Windsor,
              Windsor, Ontario.
INSTITUTION   Windsor Univ. (Ontario).
PUB DATE      [77]
NOTE          97p.; Pages 85 through 119 of the original document
              are copyrighted and therefore not available. They are
              not included in the pagination
EDRS PRICE    MF-$0.83 HC-$4.67 Plus Postage.
DESCRIPTORS   Academic Achievement; Class Size; *College Students;
              College Teachers; *Course Evaluation; Higher
              Education; Predictor Variables; *Questionnaires;
              Student Characteristics; *Student Evaluation of
              Teacher Performance; Teacher Characteristics; *Test
              Reliability; *Test Validity
IDENTIFIERS   *Student Opinion Survey of Teaching

ABSTRACT



The reliability and validity of the Student Opinion
Survey of Teaching (SOST) were assessed, and normative data were
provided for judging its value in the evaluation of faculty teaching.
Data were collected from 2,229 students enrolled in 93 classes taught
by 53 instructors in 12 academic disciplines at the University of
Windsor, Ontario. Internal consistency of the SOST was moderate to
relatively high on three sections; however, alpha coefficients for two
sections were unacceptably low. Low but significant positive
correlations were found between eleven of the SOST items and student
achievement in an introductory psychology course. These findings
provide evidence that the instrument possesses a certain degree of
criterion-related validity. Five factors accounted for 55% of the
variance in item responses: instructional skill, student-teacher
interaction, work load, instructor's organization of the course, and
feedback. Results indicated that a student's major may affect his or
her evaluations of courses and instructors, upper-level students tend
to rate instructors more favorably than lower-level students,
superior or above average students tend to give their instructors
better ratings, and elective courses are rated more favorably than
required courses. The author suggests that although the SOST is valid
and reasonably stable, the instrument should not be adopted in its
present form without revision and further testing. (MV)



Documents acquired by ERIC include many informal unpublished materials not available from other sources. ERIC makes every
effort to obtain the best copy available. Nevertheless, items of marginal reproducibility are often encountered and this affects the
quality of the microfiche and hardcopy reproductions ERIC makes available via the ERIC Document Reproduction Service (EDRS).
EDRS is not responsible for the quality of the original document. Reproductions supplied by EDRS are the best that can be made from
the original.



FIELD TEST AND VALIDATION OF A
TEACHING EVALUATION INSTRUMENT:
THE STUDENT OPINION SURVEY OF TEACHING

A REPORT

SUBMITTED TO THE SENATE COMMITTEE
FOR TEACHING AND LEARNING
FACULTY SENATE
UNIVERSITY OF WINDSOR
WINDSOR, ONTARIO

JOEL J. MINTZES
ASSISTANT PROFESSOR
DEPARTMENT OF BIOLOGY

With the Assistance and Cooperation of:

LAUREL BROWN
DORIS COMPTON
DAVID V. REYNOLDS
DEPARTMENT OF PSYCHOLOGY

UNIVERSITY OF WINDSOR
WINDSOR, ONTARIO
1976-77






This research was supported by a grant from the Ontario Universities Program
for Instructional Development (OUPID) through the University of Windsor Office
of Learning-Teaching Development (Professor W. Romanow, Coordinator).



TABLE OF CONTENTS

Section

I.   INTRODUCTION
     1.1  Objectives
     1.2  Background
     1.3  The Problem
     1.4  Experimental Design
          1.41  Subjects
          1.42  Data Collection
          1.43  Analyses
     1.5  Organization of Report

II.  RELIABILITY AND VALIDITY OF TEACHING EVALUATION INSTRUMENTS
     2.1  Reliability
          2.11  Internal Consistency Studies
          2.12  Stability Studies
     2.2  Validity
          2.21  Validity Studies: Overview
          2.22  Student Ratings and Ratings of Others
          2.23  Student Ratings and Achievement
          2.24  Construct Validity: Factor Analysis
          2.25  Effect of Student Variables on Ratings
          2.26  Effect of Instructor Variables on Ratings
          2.27  Effect of Class Variables on Ratings

III. RELIABILITY AND VALIDITY OF THE SOST
     3.1  Internal Consistency of the SOST
     3.2  Stability of the SOST
     3.3  Criterion-related Validity: Relationships between SOST
          Ratings & Student Achievement
     3.4  Factor Analysis of SOST
     3.5  The Effect of Student Variables on SOST Ratings
          3.51  Student's Major
          3.52  Student's Level
          3.53  Student's Performance
          3.54  Course Status (Compulsory/Elective)
          3.55  Student's Effort
     3.6  The Effect of Instructor Variables on SOST Ratings
          3.61  Instructor's Rank
          3.62  Instructor's Sex
     3.7  The Effect of Class Variables on SOST Ratings
          3.71  Class Size
          3.72  Class Meeting Time

IV.  SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
     4.1  Summary
          4.11  Internal Consistency
          4.12  Stability
          4.13  Relationship Between Ratings and Student Achievement
          4.14  Factor Analysis
          4.15  Effect of Student Variables on Ratings
          4.16  Effect of Instructor Variables on Ratings
          4.17  Effect of Class Variables on Ratings
     4.2  Conclusions and Recommendations

BIBLIOGRAPHY

APPENDIX A.  The Student Opinion Survey of Teaching (SOST) and
             Intercorrelations Based on 2229 Student Responses

APPENDIX B.  Follow-up Letter

APPENDIX C.  "Normative" Data Based on 93 Classes

APPENDIX D.  Other Teaching Evaluation Instruments and Correlations
             with SOST



LIST OF TABLES

1.1   Summary of Data by Subject Area
1.2   Profile of Instructors by Rank and Sex
1.3   Profile of Student Raters
1.4   Summary of SOST Data: Means and Standard Deviations
2.1   Sources of Test-Score Variance Classified
2.2   Representative Studies of Internal Consistency
2.3   Representative Studies of Stability
2.4   Characteristics of Good Teaching
2.5   Student Ratings and Ratings of Others
2.6   Student Ratings and Achievement
2.7   Factor Analyses of Student Rating Instruments
2.8   Effect of Student Variables on Ratings
2.9   Effect of Instructor Variables on Ratings
2.10  Effect of Class Variables on Ratings
3.1   Internal Consistency of the SOST
3.2   Stability of the SOST
3.3   Profile of Introductory Psychology Students
3.4   Relationships Between SOST Ratings and Student Achievement
3.5   Factor Analysis of SOST
3.6   Factor Structure of SOST
3.7   Effect of Student's Major on SOST Ratings
3.8   Effect of Student's Level on SOST Ratings
3.9   Effect of Student's Performance on SOST Ratings
3.10  Effect of Course Status on SOST Ratings
3.11  Effect of Student's Effort on SOST Ratings
3.12  Effect of Instructor's Rank on SOST Ratings
3.13  Effect of Instructor's Sex on SOST Ratings
3.14  Effect of Class Size on SOST Ratings
3.15  Effect of Class Meeting Time on SOST Ratings




I. INTRODUCTION

1.1 OBJECTIVES

The objectives of this study were to assess the reliability and
validity of the Student Opinion Survey of Teaching (SOST) and to provide
normative data for judging the usefulness of this instrument in the
evaluation of faculty teaching at the University of Windsor.

1.2 BACKGROUND

At the January 25, 1975 meeting of the University of Windsor
Faculty Senate, a resolution was passed to establish a special committee
"...to review the present practices and procedures for student evalua-
tions of teaching performance." This committee presented an interim
report in December of 1975 (Student Evaluations Committee).

In addition to proposing a University policy on teaching evalu-
ation, the Senate Student Evaluations Committee devoted a considerable
amount of time and energy to the development of a survey instrument for
eliciting student opinion of teaching performance. In developing the
Student Opinion Survey of Teaching (SOST), the Committee examined and
analyzed a large number of similar questionnaires used by both Canadian
and American universities.

In its December report, the Committee recommended:

(a) that the SOST be adopted for University-wide evaluations by all
    Faculties and Departments;

(b) that a student committee be charged with coordinating survey
    activities, validating and updating the instrument, and
    interpreting the data;

(c) that the results of faculty evaluations be made available to
    instructors, students, promotion and tenure committees, and
    University administration.

1.3 THE PROBLEM

Following the December report, concern was expressed by a number
of individuals that the immediate adoption and University-wide dissem-
ination of SOST results would be inappropriate and even negligent until
the instrument had been field-tested and, if necessary, revised. It
was felt that the gathering and dissemination of data which could
directly affect the promotion and tenure of large numbers of faculty
members should be based on an instrument of known reliability and
validity. Furthermore, some felt the field-test and validation should
be conducted by unbiased individuals who had no part in the development
of the instrument.

The present study, supported by funds from the Ontario Universities
Programme for Instructional Development through the Office of Learning-
Teaching Development, examined the reliability and validity of the SOST
(Appendix A). Recommendations for revision are included in Section IV
of this report.

1.4 EXPERIMENTAL DESIGN

This section describes the general experimental design employed
in the study. Detailed descriptions of subjects, data collection, and
analytic procedures are included in Section III.

1.41 Subjects

The total data pool represents the responses of 2229 students who
were enrolled in 93 classes taught by 53 instructors in 12 academic
disciplines. Although an attempt was made to sample student opinion
from a wide range of subject areas, the data do not necessarily
represent a random cross-campus sample of students, courses or instruc-
tors.

Participation in the study by instructors was voluntary. In most
cases individual instructors were contacted verbally and consent was
obtained by a follow-up explanatory letter (Appendix B). In a few cases
department heads and deans were asked to approach individual faculty
members on a voluntary basis.

Table 1.1 summarizes the data pool, indicating the number of
student responses, instructors and classes by subject area. Approxi-
mately one-half of the responses were obtained among students enrolled
in Biology and Psychology courses. These two departments also accounted
for over 50% of the instructors (30/53) and classes (61/93).
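The shares quoted for these two departments can be checked directly from the counts in Table 1.1. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative only and are not part of the original report.

```python
# Counts transcribed from Table 1.1: (instructors, classes, student responses).
table_1_1 = {
    "Biology": (13, 36, 670),
    "Psychology": (17, 25, 622),
}
totals = (53, 93, 2229)  # TOTALS row of Table 1.1


def combined_share(rows, totals):
    """Return the combined share of each column held by the given rows."""
    column_sums = [sum(column) for column in zip(*rows.values())]
    return [s / t for s, t in zip(column_sums, totals)]


shares = combined_share(table_1_1, totals)
for label, share in zip(("instructors", "classes", "responses"), shares):
    print(f"{label}: {share:.1%}")
```

On these counts the two departments hold somewhat more than half of every column, which is consistent with the caution in Section 1.41 that the sample is not a random cross-campus sample.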



Table 1.2 presents a profile of instructors by rank and sex.
The category "other", which includes Lecturers, Instructors and
Teaching Assistants, accounted for almost 50% of the instructors
by rank. The fewest ratings were obtained among Full Professors
(10%). Approximately 60% of the instructors were males; 40% were
females.



Table 1.1 SUMMARY OF DATA BY SUBJECT AREA*

Subject Area                 Instructors   Classes   Student Responses

Biology                           13          36            670
Business Administration            1           2             97
Chemistry                          1           1             79
Education                          2           2             32
Engineering                        1           1              9
Geology                            1           2             59
Germanic & Slavic Studies          4           9            106
Mathematics                        1           2            105
Nursing                            8           9            265
Philosophy                         2           2             29
Psychology                        17          25            622
Sociology & Anthropology           2           2            156

TOTALS                            53          93           2229


Table 1.2 PROFILE OF INSTRUCTORS BY RANK AND SEX

                                  Sex
Rank                        Male     Female     Total

Professor                     4         1          5
Associate Professor           9         2         11
Assistant Professor           8         4         12
Other                        10        15         25

TOTALS                       31        22         53



A profile of the student raters is given in Table 1.3. About
65% of the students in the sample were Science and Mathematics or
Social Science Majors. These students were fairly evenly distributed
between Honours and General programmes. Over 50% of the students were
enrolled in their first year of university.

1.42 Data Collection

Two part-time research assistants were employed to help in the
data collection and organization. The general procedure was as follows:

a) the assistant arrived at the agreed-upon (instructor-selected)
   time and the instructor was asked to leave the room.

b) the students were asked to cooperate in the evaluation of the
   instructor but were not informed that responses would be used
   for research purposes.

c) each student received a copy of the instrument (SOST) and a
   multiple choice standard response form for recording his/her
   evaluations.

d) students were asked to indicate the course number and the
   instructor's name on the response form but to omit their own
   name and student number.

e) depending on class size, the students were permitted 10 to 15
   minutes to complete the evaluation. The response forms and
   instruments were then collected and the students were thanked
   for their cooperation.

f) the completed response forms were optically scanned and the data
   transferred to standard data processing cards.

All evaluations were completed during the last 4 weeks of the
Fall and Spring terms, 1976-77. Table 1.4 presents a summary of the
data, giving means and standard deviations for each of the evaluative
items (9-28). This summary is based on the entire data pool (2229
responses) regardless of class or instructor.
1.43 Analyses

All analyses were performed with the aid of the University IBM
Table 1.3 PROFILE OF STUDENT RATERS*

Item                                  Response        Frequency

1. My major is in:                    A  Arts            13.1%
                                      B  Soc. Sci.       24.6%
                                      C  Sci. & Math     40.3%
                                      D  Bus.             8.9%
                                      E  Other           [illegible]

2. This course is part of my:         A  Hon. Pgm.       52.1%
                                      B  Gen. Pgm.       47.9%

3. I have completed the               A  0-2             5?.0%
   following number of                B  3-7             15.?%
   University-level full              C  8-12            [illegible]
   courses:                           D  13-17            8.5%
                                      E  18-              3.7%

4. Rating myself against              A  Superior         4.9%
   the performance of                 B  Above Avg.      38.?%
   other students in the              C  Average         49.5%
   class, I see myself in             D  Below Avg.       6.0%
   one of the following               E  Failing          1.0%
   groups:

5. This course was com-               A  Yes             51.7%
   pulsory.                           B  No              44.2%
                                      C  Not Sure         4.1%

6. My Attendance and Punc-            A  Yes             91.1%
   tuality have been con-             B  No               8.9%
   sistently good.

7. Compared to other                  A  Excellent       10.3%
   courses I have taken,              B  Above Avg.      39.9%
   I consider my effort               C  Average         41.6%
   in this course to                  D  Below Avg.       6.9%
   have been:                         E  Poor             1.2%

8. I have found the mat-              A  Yes             29.3%
   erial in this course               B  No              70.7%
   to be inherently diffi-
   cult.

Table 1.4 SUMMARY OF SOST DATA:
MEANS & STANDARD DEVIATIONS (1, 2)

ITEM        MEAN        STANDARD DEVIATION

 9          1.900       0.91?
10          2.244       [illegible]
11          3.756       [illegible]
12          1.869       0.855
13          1.949       [illegible]
14          1.4?1       0.668
15          1.848       [illegible]
16          ?.627       [illegible]
17          2.009       0.373
18          2.353       1.076
19          3.236       1.147
20          2.778       1.022
21          2.464       0.976
22          2.654       0.917
23*         2.453       1.016
24          3.653       0.988
25          3.513       0.760
26          3.518       0.966
27          2.808       1.227
28*         2.436       0.976

1 Based on 2229 student responses in 93 class sections taught by
  53 instructors in 12 academic disciplines.

2 Responses were coded as follows: A = 1, B = 2, C = 3, D = 4,
  and E = 5.



360/65 computing facility using subprograms of the Statistical Package
for the Social Sciences (SPSS) and the Statistical Analysis System
(SAS). The following is a general outline of the analyses:



Normative Data (Appendix C). These are descriptive data including
means and standard deviations for each of the items on the instrument.
These data differ from those in Table 1.4 in that all analyses were
based on class means. These data permit individual instructors to
compare their own class evaluations with "average" ratings obtained
in 93 other classes.

Reliability: Both internal consistency and stability were examined.
The internal consistency of each subscale was estimated using Cronbach's
alpha coefficient. Stability was assessed by the test-retest procedure
with intervals of 7, 14, 21, and 28 days.
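The distinction drawn above between statistics pooled over all student responses (Table 1.4) and statistics based on class means (Appendix C) matters whenever classes differ in size. The sketch below illustrates the difference with invented ratings; the class names and values are hypothetical and do not come from the study.

```python
from statistics import mean

# Hypothetical ratings on one SOST item (coded 1-5) for three classes
# of unequal size. These values are invented for illustration.
classes = {
    "class_a": [1, 2, 2, 1, 2, 1, 2, 2],
    "class_b": [4, 5],
    "class_c": [3, 3, 2],
}


def pooled_mean(classes):
    """Mean over every student response, regardless of class."""
    return mean(r for ratings in classes.values() for r in ratings)


def mean_of_class_means(classes):
    """Mean of the class means: each class counts once, whatever its size."""
    return mean(mean(ratings) for ratings in classes.values())


print(round(pooled_mean(classes), 3))
print(round(mean_of_class_means(classes), 3))
```

In the pooled figure the large class dominates; in the class-mean figure every class carries equal weight, which is the appropriate basis when an instructor compares one class against other classes.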

Validity: The construct validity of the instrument was examined
by factor analysis. In addition, a series of analyses of variance
were performed to determine whether student responses were systematically
biased by irrelevant factors (student characteristics; instructor char-
acteristics; class characteristics). Criterion-related validity was
assessed by examining relationships between student ratings and objective
measures of student achievement. Finally, correlations were obtained
between each of the SOST items and each item on a number of other widely-
used teaching evaluation instruments.

1.5 ORGANIZATION OF REPORT

Section II of this report presents a brief review of representative
studies on the reliability and validity of teaching evaluation instru-
ments. Section III reports the reliability and validity of the SOST
and Section IV gives a summary of the findings, presents the conclusions,
and forwards several recommendations concerning the development and use
of teaching evaluation instruments at the University of Windsor.



II. RELIABILITY AND VALIDITY OF TEACHING EVALUATION INSTRUMENTS

2.1 RELIABILITY

Two terms that are often used to describe the meaning of relia-
bility are "precision" and "consistency." In test and measurement
theory, reliability is an estimate of the extent to which "differences
in (observed) test scores are attributable to 'true' differences in
the characteristics under consideration and the extent to which they
are attributable to chance errors" (Anastasi, 1976). Said another
way, the reliability of a test is a measure which allows us to
estimate what proportion of observed test score variance is error
variance.

Gulliksen (1965) defines these relationships concisely as follows:

    $X_i = T_i + E_i$

Where: $X_i$ = the observed score of the i-th person
       $T_i$ = the "true" score of the i-th person
       $E_i$ = the error component for the same person

It is apparent, then, that the observed score for any individual is
a composite of the individual's "true" score and an error factor.
Furthermore, the variance of observed scores in a population ($S_X^2$) is
the sum of the "true" score variance ($S_T^2$) and the error variance
($S_E^2$), and the reliability coefficient is the ratio:

    $r_{XX} = S_T^2 / S_X^2$
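The ratio above can be illustrated by simulation. The following is a minimal sketch, assuming normally distributed "true" scores and independent normally distributed errors; the function name, sample size, standard deviations and seed are all illustrative assumptions, not values from the study.

```python
import random
from statistics import pvariance


def simulate_reliability(n=5000, true_sd=1.0, error_sd=0.5, seed=0):
    """Simulate observed = true + error and return var(true) / var(observed)."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, true_sd) for _ in range(n)]
    errors = [rng.gauss(0.0, error_sd) for _ in range(n)]
    observed = [t + e for t, e in zip(true_scores, errors)]
    return pvariance(true_scores) / pvariance(observed)


# With these assumed standard deviations the theoretical ratio is
# 1.0**2 / (1.0**2 + 0.5**2) = 0.8; the simulated value should be close.
estimate = simulate_reliability()
print(round(estimate, 3))
```

In practice the "true" scores are never observed, which is why reliability has to be estimated indirectly through procedures such as internal consistency and test-retest studies.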



An extremely important question is, "what factors add to the error
variance, thereby affecting the reliability of a given test?" To help
answer this question Thorndike (1949) and Cronbach (1970) have classi-
fied the sources of test score variance (Table 2.1). The sources of
variance include: (1) lasting-general characteristics such as
reading and problem solving abilities, (2) lasting-specific
characteristics such as knowledge of specific test questions, (3)
temporary-general characteristics such as health, fatigue and motivation,

Table 2.1 SOURCES OF TEST-SCORE VARIANCE CLASSIFIED*

I.   Lasting and general characteristics of the individual
     1. General skills (e.g., reading)
     2. General ability to comprehend instructions, testwiseness,
        techniques of taking tests
     3. Ability to solve problems of the general type presented in
        this test
     4. Attitudes, emotional reactions, or habits generally oper-
        ating in situations like the test situation

II.  Lasting and specific characteristics of the individual
     1. Knowledge and skills required by particular problems in
        the test
     2. Attitudes, emotional reactions, or habits related to
        particular test stimuli (e.g., fear of high places brought
        to mind by an inquiry about such fears on a personality
        test)

III. Temporary and general characteristics of the individual
     1. Health, fatigue, and emotional strain
     2. Motivation, rapport with examiner
     3. Effects of heat, light, ventilation, etc.
     4. Level of practice on skills required by tests of this
        type
     5. Present attitudes, emotional reactions, or strength of
        habits (insofar as these are departures from the person's
        average or lasting characteristics, e.g., political
        attitudes during an election campaign)

IV.  Temporary and specific characteristics of the individual
     1. Changes in fatigue or motivation developed by this particular
        test (e.g., discouragement resulting from failure on a
        particular item)
     2. Fluctuations in attention, coordination, or standards of
        judgment
     3. Fluctuations in memory for particular facts
     4. Level of practice on skills or knowledge required by this
        particular test (e.g., effects of special coaching)
     5. Temporary emotional states, strength of habits, etc., related
        to particular test stimuli (e.g., a question calls to mind a
        recent bad dream)
     6. Luck in the selection of answers by "guessing"

*After R.L. Thorndike, 1949, p. 73 and L.J. Cronbach, 1970, p. 175.

and (4) temporary-specific characteristics such as fluctuations in
attention, coordination and memory for specific facts.

The lasting-general characteristics affect the "true" score vari-
ance and the temporary-specific characteristics affect the error
variance. The lasting-specific characteristics and temporary-general
characteristics may affect either variance depending upon the type
of reliability being studied.

Research on teaching evaluation instruments has concentrated on
two types of reliability: internal consistency and stability. In
general, internal consistency studies examine the degree of homogen-
eity of items and/or behaviours sampled by a test or subscale. Tests
of internal consistency such as Cronbach's alpha (Cronbach, 1970) count
lasting-specific and temporary-specific characteristics as error
variance. On the other hand, studies of stability provide an index
of the extent of fluctuation in scores over a specified time interval.
The test-retest procedure, a commonly employed measure of stability,
considers temporary-general and temporary-specific characteristics as
sources of error variance.

Internal consistency and stability are independent of each other.
An internally consistent instrument may or may not be stable. A stable
instrument may or may not be internally consistent.
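As an illustration of the internal consistency measure named above, the following is a minimal sketch of Cronbach's alpha computed from a respondents-by-items score matrix, using the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the small score matrix are hypothetical and are not data from the SOST.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents (columns of the matrix).
    item_variances = [pvariance(column) for column in zip(*scores)]
    # Variance of each respondent's total score across the k items.
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five students to a three-item subscale (coded 1-5).
example = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

Alpha is high here because the three invented items rise and fall together across students; it says nothing about whether the same students would give the same ratings a few weeks later, which is the separate question of stability.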




2.11 Internal Consistency Studies

A large number of studies on the internal consistency of student
evaluation instruments have been reported over the past 25 years.
Table 2.2 provides a fairly representative sample of these studies.
We will not attempt here to evaluate or even discuss each of these
studies individually; however, a number of cautionary statements seem
appropriate.

Even a casual examination of the research shows a great deal of
variability among studies with regard to:

1) numbers of student raters and numbers of courses,

2) numbers of items per subscale or instrument (2 to 140), and

3) analytic procedures (split-half, Kuder-Richardson
   formulae, odd-even means, Cronbach's alpha, Hoyt and
   Horst procedures).

These differences may affect the interpretation of the internal
consistency coefficients.

Perhaps the most significant of the above mentioned sources of
variability is the number of items per subscale. It has long been
recognized (Spearman, 1910 and Brown, 1910) that, other things being
equal, the longer a test the more reliable (internally consistent)
it is. Therefore, care should be taken when comparing internal
consistency scores across subscales possessing different numbers
of items.
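The relationship between test length and reliability credited above to Spearman (1910) and Brown (1910) is commonly written as r' = n r / (1 + (n - 1) r), where r is the reliability of the original test and n is the factor by which its length is changed. The sketch below applies that standard formula; the function name is illustrative.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Stepping a half-test correlation up to full length (length_factor = 2).
for half_test_r in (0.79, 0.87):
    print(half_test_r, round(spearman_brown(half_test_r, 2), 2))
```

With a length factor of 2 this correction takes .79 to .88 and .87 to .93, which matches the "corrected to" entries listed in Table 2.2; it also shows why a 140-item instrument and a 3-item subscale cannot be compared on raw coefficients alone.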

Another source of variability among the studies is the unit of
analysis employed. In some studies the unit of analysis is individual
students within a class (example: Wherry, 1951). In other studies it
is students across classes (example: Aleamoni and Spencer, 1973). In
still other studies, the analytic unit is class means, presumably regard-
less of individual class size (Pohlmann, 1975). Here again, because of
differences in experimental design, caution should be exercised when
comparing coefficients among studies.

With these cautions in mind, one might still be impressed with
the remarkably high internal consistency coefficients reported. The
coefficients compare quite favorably with many psychometric instruments
including ability tests and personality inventories, even those developed
by factor analytic techniques.



Researcher

    Wherry (1951)
    Lovell and Haner (1955)
    Remmers and Weisbrodt (1965)
    Harvey and Barker
    Hildebrand, Wilson and Dienst (1971)
    Doyle (1972)
    Aleamoni and Spencer (1973)

Student Raters (N)

    rating past better and worse instructors
    42
    46
    46
    47
    44
    105 in 4 courses
    1908 in 59 courses
    5?9 male students regardless of courses
    1015 rating past best and worst instructors
    379 in 11 courses
    297 regardless of courses

Type of Instrument

    12-item 25-point ratings
    140-item 25-point ratings
    140-item five-point ratings
    12-item five-point ratings
    70 forced-choice dyads
    36 forced-choice tetrads
    10 ten-point ratings (PRSI) (a)
    21-item ten-point ratings
    7-8 item seven-point ratings
    9-28 item 5-point ratings (SOS) (c)
    50 five-point ratings (CEQ) (b)
    50 five-point ratings

Analytic Procedure

    Split-half
    Split-half
    Split-half
    Split-half
    Kuder-Richardson 14
    Odd-even means
    Horst
    Product-moment correlation
    Alpha
    Hoyt
    Split-half (negative versus positive items)
    Split-half (mixed negative and positive)

Internal Consistency

    .88
    .96
    .98
    .88
    .79 corrected to .88
    .67-.91
    .38-.93
    .80
    .90-.96
    .85 corrected to [illegible]
    .87 corrected to .93

(continued)



Table 2.2 REPRESENTATIVE STUDIES OF INTERNAL CONSISTENCY*

Researcher

    Pohlmann (1975)

Student Raters (N)

    ? in 16 courses
    94-571 in [illegible] courses
    35,000 in 1,279 courses

Type of Instrument

    50 five-point
    8-10 item CEQ (b) subscales
    21-item 5-point ratings

Analytic Procedure

    Kuder-Richardson 21
    Kuder-Richardson 20
    Product-moment correlation

Internal Consistency

    .93 average
    .40-.92

(a) Purdue Rating Scale for Instructors
(b) Illinois Course Evaluation Questionnaire
(c) Minnesota Student Opinion Survey

*Modified and Updated after Doyle (1975)

It seems, then, that many widely used teaching evaluation instru-
ments do possess relatively high internal consistency, with coefficients
averaging approximately .7 to .9.

2.12 Stability Studies

Research on the stability of student evaluation instruments has
extended over a period of 50 years beginning with the work of Remmers
and his associates (1927) on the Purdue Rating Scale for Instructors.
Although a considerable number of studies have been reported, the use-
fulness of this kind of information has been questioned by several
educators and psychometricians.


Says Doyle (1975):

        While it would be important
        to know the extent to which
        ratings change over time as
        a function of random or sys-
        tematic rater, task, and sit-
        uational factors as distin-
        guished from instructor and
        course factors, the typical
        retest study is only margin-
        ally adequate to the task,
        given that instructor changes
        are uncontrolled and trait
        differences usually unexamined.

Researchers conducting stability studies have responded that
teaching behaviours tend to remain stable over short periods of time
even, for example, when instructors are given feedback by way of
student evaluations (Murray, 1973). Therefore, retest studies are
helpful in identifying the extent of a major source of error variance:
that attributable to fluctuations in student characteristics.

As with studies of internal consistency, the design of stability
studies varies considerably. The major differences among studies are:
number of student raters, type and number of items, and, very impor-
tantly, the time interval between initial test and retest. (In general,
the stability of an instrument decreases with increasing time intervals
(Anastasi, 1976).)

All of the studies summarized in Table 2.3 with the exception of
Bausell et al. (1975) examined stability within individual courses over
relatively short periods of time (3 days to one semester). It is
probable that the majority of the variance in stability coefficients
among these studies is attributable to differences in the items and
numbers of student raters (critical r values decrease with increasing
N). In general, these studies indicate that student evaluation instru-
ments possess moderate to high stability over short time intervals.
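The remark that critical r values decrease with increasing N can be illustrated numerically. The sketch below uses the Fisher z-transformation with a normal approximation, which is an assumption made here for convenience; exact critical values come from the t distribution with N - 2 degrees of freedom, and the function name is illustrative.

```python
from math import sqrt, tanh
from statistics import NormalDist


def approx_critical_r(n, alpha=0.05):
    """Approximate two-tailed critical value of Pearson r for sample size n.

    Uses the Fisher z-transformation (standard error 1 / sqrt(n - 3)),
    so the result is an approximation, not the exact t-based value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return tanh(z / sqrt(n - 3))


# Sample sizes of roughly the magnitude reported in Table 2.3.
for n in (30, 92, 200, 271):
    print(n, round(approx_critical_r(n), 3))
```

A coefficient that would be unremarkable in a study of 200 raters may therefore fall short of significance in a study of 30, which is one reason the coefficients in Table 2.3 are not directly comparable.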

Another objection to stability studies is that voiced by Kulik
and McKeachie (1975):

        For educational administrators and
        researchers, the meaning of these
        reliability coefficients is limited.
        Reliability coefficients for indi-
        vidual student ratings reflect the
        degree of consistency of students.
        Most educational administrators and
        researchers are concerned with the
        consistency of teachers and therefore
        are more concerned with the reliability
        of class ratings of instructors.

The study by Bausell et al. (1975) is particularly interesting in
that it addresses the problems posed by Kulik and McKeachie. Rather
than using individual student ratings, class means were computed.
Class means for "same course - same instructor" were correlated across
time intervals ranging from one semester to two years. In essence, then,
this study examined the combined stability of teaching behaviours and
student ratings. The data are confounded, however, because raters changed
with class enrollment.
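The class-mean approach described above can be sketched as follows: compute a mean rating per course at each of two occasions, then correlate the two sets of means across courses. The course labels and ratings below are invented for illustration and are not data from Bausell et al.; the function names are likewise hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def class_mean_stability(first, second):
    """Correlate class means for courses rated on both occasions."""
    courses = sorted(first.keys() & second.keys())
    first_means = [mean(first[c]) for c in courses]
    second_means = [mean(second[c]) for c in courses]
    return pearson_r(first_means, second_means)


# Hypothetical ratings (1-5) for the same course and instructor in two terms.
term_1 = {"course_a": [1, 2, 2], "course_b": [3, 3, 4], "course_c": [4, 5, 5]}
term_2 = {"course_a": [2, 2, 1, 2], "course_b": [3, 4], "course_c": [5, 4, 4]}
stability = class_mean_stability(term_1, term_2)
print(round(stability, 3))
```

Because the students in each term differ, a coefficient of this kind mixes the stability of the instructor's teaching with differences between groups of raters, which is the confounding noted above.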

Nonetheless, several interesting conclusions might be drawn by
comparing Bausell's mean stability coefficients (.64, .78 and .65) with
those from within individual courses (Table 2.3). It appears that the
two are not strikingly dissimilar. This would argue that the majority
of the variability in student ratings can be attributed to the rater
and that teaching behaviours remain fairly stable even over longer
periods of time.

Costin, Greenough and Menges (1971) summarize their review of
reliability studies in the following way:

        It would appear, then, that students
        can rate classroom instruction with
        a reasonable degree of reliability.
        In particular, the evidence cited
        concerning the stability of students'
        ratings argues against the contention
        that student opinions of
        instruction are difficult to interpret
        since they might be made after a
        particularly good or bad atypical
        experience (e.g., a lecture).

Table 2.3  REPRESENTATIVE STUDIES OF STABILITY*

Study
    Remmers and Brandenburg (1927)
    Root (1931)
    Lovell and Haner (1955)
    Costin (1968)
    Kooker (1968)
    Costin (1971)
    Kohlan (1973)
    Bausell et al. (1975)

Raters
    30-33 in 3 courses
    200 in one course
    105 in 4 courses
    Unreported number, mostly in sections of one large course
    92 in 4 sections
    Same
    219 of 11 instructors
    271 in eight classes
    41 courses
    39 courses
    37 courses

Instrument
    10 ten-point ratings (PRSI)(1)
    50 item checklist
    36 forced-choice tetrads
    5 sub-scales, 3-5 five-point ratings each
    7 subscales, 7-14 five-point ratings each
    Total scores
    4 subscales, 4-7 five-point ratings each (factor scores)
    3 subscales, 4-7 five-point ratings each
    Single general five-point ratings
    14 5-point ratings
    12 5-point ratings
    10 5-point ratings

Interval
    3 days
    4 weeks
    2 weeks
    Mid-semester to end of semester
    2 weeks
    2 weeks
    2 weeks
    2nd day of semester to last week of semester
    Fall 1968 to Spring 1969
    Fall 1972 to Fall 1973

Stability coefficient
    .95
    .89
    .41-.87
    .58-.87
    .91
    .67-.77
    .55-.70  (2nd day of semester to last week of semester)
    .58
    .31-.79 (mean = .64)
    .64-.87 (mean = .78)
    .23-.80 (mean = .65)  (Spring 1973 to Fall 1973)

(1) Purdue Rating Scale for Instructors

*Modified and updated after Doyle (1975)



2.2 VALIDITY

In general, the validity of a psychometric instrument is an
assessment of what an instrument measures and how well it measures
it. However, these criteria are not sufficiently specific to convey
the entire meaning of validity.

According to Standards for Educational and Psychological Tests
and Manuals (1974), a joint effort of the American Psychological
Association, American Educational Research Association and the National
Council on Measurement in Education, validity information indicates
the degree to which the test is capable of achieving certain aims.
The aims of testing may be:

1. .... to determine how an individual performs at present in
   a universe of situations that the test situation is claimed
   to represent (ex: school achievement tests)

2. .... to forecast an individual's future standing or to
   estimate an individual's present standing on some variable
   of particular significance that is different from the test
   (ex: scholastic aptitude tests)

3. .... to infer the degree to which the individual possesses
   some hypothetical trait or quality (construct) presumed to
   be reflected in the test performance (ex: personality
   tests)

Based on these "aims of testing" three types of validity have
been recognized: content validity, criterion-related validity, and
construct validity.

Content validity is a measure of the extent to which the instrument
samples the universe of behaviours about which generalizations are
to be made. This type of validity is important in scholastic and
vocational tests of knowledge or specific skills. For example, a
valid test of arithmetic ability would contain a representative
sample of addition, subtraction, multiplication and division problems
at varying levels of difficulty and abstraction.

In order to examine the content validity of a test one must
first do a systematic analysis of the behaviour domain covered by
the test. Subsequently, one would examine the number and levels of
specific test items to ensure proportional representation and
completeness of coverage.

Criterion-related validity is a measure of how well a test
predicts an individual's behaviour in a given set of circumstances.
Usually it is demonstrated by correlating test scores with some
independent measure of performance. For example, the Medical
College Admissions Test (MCAT) is designed to predict academic
success in medical school. To the extent that it does so with
some degree of accuracy, it may be said to have criterion-related
validity.

The APA Standards identifies two types of criterion-related
validity which differ in the time interval between test administration
and criterion measurement. If the two are separated by a reasonable
period of time, the measure of association between them is referred
to as "predictive" validity. This kind of information is most useful
in making decisions such as personnel hiring or classification
and student admissions or placement.

If a test is administered to an individual or group of individuals
on whom criterion data are already available, the relationship is
referred to as "concurrent" validity. Knowledge of concurrent validity
is helpful in interpreting the results of tests which examine the
present status of an individual (patient, student, employee) rather
than predicting future status.

Construct validity is a measure of the degree to which test
scores reflect some hypothetical or theoretical trait or factor.
Examples of such factors (constructs) are "school motivation",
"social introversion", "manual dexterity" and "mathematical aptitude".

Construct validity is especially important in the clinical
application of personality inventories for diagnosing behavioural
and emotional disorders. A number of these tests, including the
Minnesota Multiphasic Personality Inventory (MMPI) and the California
Psychological Inventory (CPI), are purported to have a high degree
of construct validity.

Several procedures are commonly used to assess the construct
validity of a particular test. The most important of these are correlations
with other tests which measure the same construct, and factor analysis.

Factor analysis is a fairly sophisticated analytic procedure
that is often useful in identifying commonalities in behavioural data.
The goal of factor analysis is to see whether some underlying pattern
of relationships exists such that the data may be re-arranged or
reduced to a smaller set of factors. This is accomplished by analyzing
all possible correlations among items, grouping common items together,
and assigning appropriate loadings (weights) to each. The clusters
so formed may represent unique factors or constructs.
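
The idea of deriving loadings from inter-item correlations can be
sketched in a few lines of Python. This is a deliberately simplified
illustration, not a full factor analysis: it extracts only the first
principal component of the correlation matrix by power iteration, with
no rotation or communality estimation. The three item names and the
ratings are hypothetical and are not drawn from any instrument
discussed in this report.

```python
from math import sqrt
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlations among items; each column is one item's scores."""
    n = len(columns[0])
    z = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        z.append([(v - m) / s for v in col])  # standardize each item
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_component_loadings(R, iterations=200):
    """Largest eigenvalue of R and the item loadings on that component."""
    k = len(R)
    v = [1.0] * k
    for _ in range(iterations):  # power iteration toward dominant eigenvector
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(R[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return eigenvalue, [x * sqrt(eigenvalue) for x in v]


# Hypothetical five-point ratings from six students on three items.
item_clarity = [5, 4, 4, 2, 1, 3]
item_organization = [5, 5, 4, 2, 2, 3]
item_workload = [3, 1, 4, 2, 4, 1]

R = correlation_matrix([item_clarity, item_organization, item_workload])
eigenvalue, loadings = first_component_loadings(R)
for name, loading in zip(("clarity", "organization", "workload"), loadings):
    print(f"{name:13s} {loading:+.2f}")
```

In this made-up example the clarity and organization items correlate
highly with each other and so load together on the first component,
while the workload item loads only weakly; that is the sense in which
items are "grouped" by their loadings.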

2.21 Validity Studies: Overview

The validation of a teaching evaluation instrument is an
especially difficult task when compared to the validation of other
test and measurement devices. Each type of validity poses special
problems.

Content Validity. Presumably a teaching evaluation instrument
possessing content validity is one that has a representative number
(and type) of items which measure teaching behaviours that affect
student learning. Unfortunately, the plain truth of the matter is
that we know very little about this domain of behaviours. In fact,
it has been suggested by some (Butts and Capie, 1977) that teaching
behaviours may account for less than 10% of the variance in student
achievement.

Another approach is to ask students the criteria that they
consider important to teaching effectiveness and then to examine
teacher rating forms for completeness based on these criteria.
In fact, this may be a better measure of content validity in that
these instruments are designed to measure the student's perception
of teaching effectiveness (Table 2.4).

The best and most widely used teaching evaluation instruments
are assumed to possess content validity because, as a first step in
their construction, students (and faculty) are often asked to list
characteristics of particularly good and/or particularly poor instructors
(Hildebrand et al., 1971). The list of characteristics is then
trimmed by data reduction procedures (often factor analysis), and
phrased into items to produce the final instrument.

Table 2.4  CHARACTERISTICS OF GOOD TEACHING*

Bousfield(1)
     1. Fairness
     2. Mastery of subject
     3. Interesting presentation of material
     4. Well-organized material
     5. Clearness of exposition
     6. Interest in students
     7. Helpfulness
     8. Ability to direct discussion
     9. Sincerity
    10. Keenness of intellect

Clinton(2)
     1. Knowledge of subject matter
     2. Pleasing personality
     3. Neatness in appearance and work
     4. Fairness
     5. Kind and sympathetic
     6. Keen sense of humor
     7. Interest in profession
     8. Interesting presentation
     9. Alertness and broadmindedness
    10. Knowledge of methods

Deshpande et al.(3)
     1. Motivation
     2. Rapport
     3. Structure
     4. Clarity
     5. Content mastery
     6. Overload (too much work)
     7. Evaluation procedure
     8. Use of teaching aids
     9. Instructional skills
    10. Teaching styles

French(4)
     1. Interprets ideas clearly
     2. Develops student interest
     3. Develops skills of thinking
     4. Broadens interests
     5. Stresses important materials
     6. Good pedagogical methods
     7. Motivates to do best work
     8. Knowledge of subject
     9. Conveys new viewpoints
    10. Clear explanations

Gadzella(5)
     1. Knowledge of subject
     2. Interest in subject
     3. Flexibility
     4. Well-prepared
     5. Uses appropriate vocabulary

Perry(6)
     1. Well-prepared for class
     2. Sincere interest in subject
     3. Knowledge of subject
     4. Effective teaching methods
     5. Tests for understanding
     6. Fair in evaluation
     7. Effective communication
     8. Encourages independent thought
     9. Course organized logically
    10. Motivates students

Pogue(7)
     1. Knowledge of subject
     2. Fair evaluator
     3. Explains clearly

Hildebrand(8)
     1. Dynamic and energetic person
     2. Explains clearly
     3. Interesting presentation
     4. Enjoys teaching
     5. Interest in students
     6. Friendly toward students
     7. Encourages class discussion
     8. Discusses other points of view

(1) Listed in order of importance, by 61 undergraduates at University
    of Connecticut.
(2) Listed in order of importance, by 177 junior-year students at
    Oregon State University.
(3) Listed in order of importance, by 674 undergraduates who rated
    32 engineering teachers.
(4) Listed in order of importance, by undergraduates at the
    University of Washington.
(5) Listed in order of importance, by 443 undergraduates at Western
    Washington State College.
(6) Listed in order of importance, 1493 students, faculty, alumni
    at University of Toledo.
(7) Listed in order of importance, 307 students at Philander Smith
    College.
(8) Listed in order of importance, by 338 undergraduate and graduate
    students at University of California, Davis.

*After R.I. Miller, 1974.

Interestingly, the factors most often named as characteristic of
good instructors seem to be fairly constant. Thus many instruments
have items covering the following areas:

a) knowledge of subject
b) ability to present material in a clear and interesting manner
c) ability to motivate students
d) course organization
e) course workload
f) rapport
g) feedback
h) student evaluation (grading procedures)

Criterion-related Validity. The problem with criterion-related
validity is that there is a general lack of agreement on the criteria.
Most people, however, would concede that a "good teacher" is one who
facilitates student learning. As a result, many studies have examined
relationships between instructor ratings and student achievement as
the best estimate of criterion-related validity. The major problem
with this, of course, is that many factors may affect learning; teaching
behaviours constitute only one of these factors (possibly one of the
least significant factors). These studies, thus, are confounded by many
variables not directly related to the teaching behaviours of the
instructor.

Another approach has been to examine relationships between student
ratings of instructors and ratings of the same instructors given by
other individuals. In most cases the "other individuals" are colleagues,
department chairmen, deans, alumni or paid observers. In at least one
study (Centra, 1973) an attempt was made to find relationships between
student ratings and self-ratings by instructors. In any case, mere
agreement between students and other observers probably constitutes
a weak measure of validity.

Construct Validity. The major problem with construct validity is
that "constructs" concerning teaching effectiveness are rather vague
and ill-defined. For example, the construct "ability to motivate
students" may mean several things to different people or even several
things to the same individual.

Nonetheless, two approaches have generally been used to
assess the construct validity of teacher rating forms. One method
is to examine correlations between instruments. If two instruments
measure the same constructs, corresponding items on the
instruments should be highly correlated. The second approach, and
one which has met with a reasonable amount of success, is factor
analysis. If an instrument possesses construct validity, subscale
factor loadings should be relatively high.



Perhaps the most common complaint voiced by opponents of teaching
evaluations is that they are subject to student bias and that ratings
are affected by many variables over which the instructor has no control.
To test these hypotheses studies have examined the effects of a
number of student, instructor, and class variables on teacher ratings.

The remainder of this section summarizes the results of
representative studies on:

1) Criterion-related Validity: Student Ratings and Ratings of Others
2) Student Ratings and Achievement
3) Construct Validity: Factor Analysis
4) Effect of Student Variables on Ratings
5) Effect of Instructor Variables on Ratings
6) Effect of Class Variables on Ratings

2.22 Student Ratings and Ratings of Others

Several studies have examined correlations between student ratings
and ratings given by colleagues (Table 2.5). In general, the coefficients
obtained have been moderate, averaging .4 to [illegible]. One study
(Murray, 1973) found a correlation of .82.

Ratings given to teaching assistants in large courses by their
supervisors also correlate moderately well with student evaluations
(.49; .62) as do alumni ratings of their former professors
(approximately .5).

Table 2.5  STUDENT RATINGS AND RATINGS OF OTHERS

Study                                   Correlation Coefficients

I.   Colleague Ratings
     Maslow and Zimmerman (1956)        .30 to .63
     Aleamoni and Yimer (1973)          .16 to .30
     Murray (1973)                      .82
     Centra (1975)                      .00 to .54

II.  Supervisor Ratings
     Hayes (1971)                       .62
     Costin (1966)                      .49

III. Alumni Ratings
     Drucker and Remmers (1950)         .40 to .68

IV.  Paid Observer Ratings
     Murray (1973)                      .92

V.   Self-ratings
     Centra (1973)                      .21

The highest degree of agreement has been found between ratings
given by paid observers and student ratings (.92). Perhaps this study
is the most significant in that the paid observers attended classes
with students and were therefore in a better position than other outside
raters to judge the effectiveness of a given instructor.

Finally, one study (Centra, 1973) examined relationships between
student ratings and instructors' self-ratings. The correlations obtained
were low, averaging .21. In most cases instructors tended to rate
themselves more favourably than their students did.

2.23 Student Ratings and Achievement

Another way of looking at criterion-related validity is to examine
relationships between ratings and student achievement. According to
Murray (1973):

        Students and faculty would agree that the
        ultimate criterion of good teaching is the
        extent to which students learn, or make
        progress toward educational goals. Most
        rating forms for student evaluation of
        teaching are not intended to provide a
        direct measure of student learning, but
        they are designed to measure aspects of
        teaching (e.g., clarity of presentation)
        that would be expected to have some direct
        or indirect effect upon student learning.
        Thus it is reasonable to expect some degree
        of positive correlation between student
        ratings of teaching and objective measures of
        student achievement.

Many studies have examined the correlation between ratings and
student achievement (Table 2.6). In general, the evidence shows a
weak but positive relationship with correlation coefficients averaging
about .20 to .30. This suggests that instructors who obtain favourable
ratings tend to be somewhat more effective in facilitating learning
than instructors who receive less favourable ratings.
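
As a rough aid to interpreting coefficients of this size, the square of
a correlation coefficient gives the proportion of variance that two
measures share. The short Python sketch below applies that standard
relationship to the range quoted above; the function name is an
illustrative choice, not something taken from the studies reviewed.

```python
def shared_variance(r):
    """Proportion of variance shared by two measures correlated at r."""
    return r * r


# Correlations at the two ends of the range reported for ratings
# and achievement.
for r in (0.20, 0.30):
    print(f"r = {r:.2f}  shared variance = {shared_variance(r):.0%}")
```

On this reading, correlations of .20 to .30 correspond to roughly 4 to 9
percent of shared variance, which is consistent with describing the
relationship as weak but positive.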

Table 2.6  STUDENT RATINGS AND ACHIEVEMENT

Study                           Correlation Between
                                Ratings and Achievement

Elliott (1950)                  + .?4
Morsh et al. (1956)             + .?0
McKeachie et al. (1971)         - .60 to + .72 (mean = + .10)
Rodin and Rodin (1972)          - .75
Frey (1973)                     + .14 to + .91
Gessner (1973)                  + .53
Skanes and Sullivan (1974)      + .39
Marsh et al. (1975)             - .02 to + .55

One study by Rodin and Rodin (1972) has received considerable
attention and caused a certain amount of controversy. The findings of
this study show a strong negative relationship (-.75) between ratings
and achievement. One of the reasons it has caused so much discussion
is that the findings were reported in Science, the prestigious journal
of the American Association for the Advancement of Science. Nevertheless,
the results have been severely criticized on methodological grounds by
several individuals including Frey (1973) whose parallel study (also
published by Science) shows equally strong but positive correlations.

2.24 Construct Validity: Factor Analysis

A large number of factor analytic studies have been performed on
student ratings. The studies have generally been of two types. One
type of study attempts to extract the overall dimensions or factors
describing "good teaching" from pools of statements submitted by
students and faculty. This type of study has often been performed
preliminary to the development of a new teaching evaluation instrument.
A second type of study factor analyzes an existing instrument
to determine its factor structure for practical use in a college or
university setting.

Probably the most influential factor analytic study was performed
by Isaacson, McKeachie, Milholland et al. (1964). In this study a
pool of 145 items describing teachers was reduced to 46 representative
statements. The 46 items were then factor analyzed for four separate
student samples. Six factors emerged and were consistently found with
two administrations, in different semesters with different students and
teachers. The six factors were labelled "Skill", "Rapport", "Structure",
"Overload", "Feedback", and "Evaluation."

The first four of these factors seem to correspond to similar
factors emerging from 11 other studies (Table 2.7). It should be noted,
however, that factor labels are derived somewhat subjectively and
therefore similarities may be misleading. It is remarkable, though, to
observe the amount of agreement in studies spanning a period of 30
years. It would seem that our basic conception of what constitutes
"good teaching" has not been altered significantly even with the
introduction of new instructional methods and modern technological advances.

Table 2.7  FACTOR ANALYSES OF STUDENT RATING INSTRUMENTS(1)

Study
    Smalzried & Remmers (1943)
    Creager (1950)
    Bendig (1954)
    Gibb (1955)
    Isaacson et al. (1964)
    Solomon (1966)
    Turner (1970)
    Deshpande et al. (1970)
      2nd-order factors
    Hartley & Hogan (1972)
    Frey (1973a)
    McKeachie & Lin (1973)

Skill
    Professional Maturity
    Professional Impression
    Instructor Competence
    Communication
    Skill
    Energy vs. Lethargy
    Exciting, Humorous, Stimulating
    Stimulation
    Overall Evaluation
    Teacher's Presentations
    Skill

Rapport
    Empathy
    Rapport
    Instructor Empathy
    Friendly-Democratic
    Rapport
    Lecturing vs. Student Participation
    Approachable, Warm, Cheerful
    Student-Teacher Interaction
    Teacher Accessibility
    (Group Interaction)

Structure
    Organization
    Structure
    Control vs. Permissiveness
    Penetrating, Clear, Focused
    Structure or Organization
    Organization, Planning
    Structure

Overload
    Academic Emphasis
    Overload
    Prepared, Probing, Demanding
    Load or Difficulty
    Work Load
    Difficulty

Other
    + one other factor
    + two other factors
    + two other factors
    + two other factors

Affective Merit     Cognitive Merit     Stress

(1) After J.A. Kulik and W.J. McKeachie, 1975.

2.25 Effect of Student Variables on Ratings

Over the past 50 years many student variables have been examined
as possible sources of bias in student ratings. While a number of
studies have looked at personality characteristics, the factors most
often studied have been demographic in nature including students' sex,
major, level (year in university), and course grades (Table 2.8).

The results of these studies have been quite variable because of
differences in experimental design and methodological rigour. Nevertheless,
it is possible to draw some tentative conclusions from the work to date.

Although several studies have found differences among ratings of
male and female students (Bendig, 1952; McKeachie et al., 1971) the
weight of evidence from the best designed studies shows no significant
difference. Furthermore, there seem to be no complex interaction
effects between student sex and instructor sex (Elsmore and Lapointe,
1974).

The student's university major seems to have no effect on ratings;
however, university level (year) has been shown rather consistently to
affect student evaluations. In most studies upper-level and graduate-level
students rate their instructors and courses more favourably than
lower-level students.

Perhaps the most controversial area of student evaluations is the
effect of course grades or expected course grades on ratings.
Unfortunately, findings in this area have been mixed. A substantial number
of investigations have found significant and positive relationships
between grades and evaluations (Kennedy, 1975); however, an equal number
of studies have reported no such effect. Costin, Greenough, and Menges
(1971) summarize the research on grades and ratings as follows:

        Does the evidence, then, support an assertion
        that a teacher can get "good" ratings simply by
        assigning "good" grades, or creating the
        expectancy that he will do so? The fact that the
        positive correlations which were obtained
        between student ratings and grades were typically
        low weakens this claim as a serious argument against
        the validity of student ratings. The positive
        findings that do occur might better be viewed as
        a partial function of the better achieving student's
        greater interest and motivation, rather than as a
        mere contamination of the validity of student
        ratings.

Table 2.8  EFFECT OF STUDENT VARIABLES ON RATINGS

Student Variable    Study                        Effect

Student's Sex       Remmers (193?)               no significant effect
                    Bendig (1952)                females rated less favourably
                    Rayder (1968)                no significant effect
                    McKeachie et al. (1971)      females rated more favourably
                    Elsmore & Lapointe (1974)    no significant effect

Student's Major     Cohen & Humphreys (1960)     no significant effect
                    Rayder (1968)                no significant effect

University Level    Remmers & Elliott            grad. students rated more
(Year)                                           favourably than undergrad.
                                                 students
                    Gage (1961)                  students in advanced courses
                                                 rated more favourably than
                                                 those in lower level courses
                    Miller (1972)                upper division courses were
                                                 rated more favourably than
                                                 lower division courses

Course Grade        Voeks & French (1960)        no significant effect
                    Remmers (1960)               no significant effect
                    Kennedy (1975)               students receiving 'A' or 'B'
                                                 rated more favourably than
                                                 those receiving 'C' or 'D'

Commenting on Costin, Greenough, and Menges' conclusions, Kulik
and McKeachie (1975) cite the work of Elliott (1950) and Morsh and
Wilder (1954). These findings support the belief that the relationship
of grades to ratings can be best viewed as the product of a complex
student ability-level by teacher presentation-level interaction:

        ...if the instructor teaches for the bright
        students, he will be approved by them and there
        will be a positive correlation between ratings
        and grades; if he teaches for the weaker students,
        he will be disapproved by the bright students
        and a negative coefficient will be obtained.
        This sort of interaction could explain the
        diverse findings in this area reviewed by Costin,
        Greenough, and Menges (1971), who found that
        some studies report a negative correlation, some
        a positive correlation, and some a non-significant
        correlation between student ratings and
        grades.

2.26 Effect of Instructor Variables on Ratings

The effects of instructor rank, sex, and research productivity
on student ratings have been studied rather extensively (Table 2.9).
With respect to academic rank, findings are somewhat mixed; however,
where significant differences are found, ratings invariably favour
senior faculty members (Full and/or Associate professors)
over junior faculty members. The instructor's sex seems to have
no effect on ratings.

One topic that seems to spark considerable controversy among
faculty members everywhere is the relationship between research
productivity and teaching effectiveness. There are those who claim
that good teaching and good research go hand-in-hand, each
complementing the other. Others claim that the two activities are mutually
destructive; good teachers have little time for good research and
good researchers have little time for students and teaching. One
position that is not often heard is that research ability and teaching
ability are essentially independent traits; good teachers may or may
not be good researchers and good researchers may or may not be good
teachers.

Table 2.9  EFFECT OF INSTRUCTOR VARIABLES ON RATINGS

Instructor Variable   Study                        Effect

Instructor Rank       Downie (1952)                Full professors rated more
                                                   favourably than other ranks
                      Gage (1961)                  Full and Associate Professors
                                                   rated more favourably than
                                                   Asst. Professors and
                                                   Instructors
                      Langen (1966)                Decreasing favourability as
                                                   follows: Assoc. Professors,
                                                   Assistant Professors,
                                                   Instructors (Full Prof. not
                                                   studied)
                      Aleamoni & Yimer (1973)      no significant correlation
                                                   between rank and student
                                                   ratings

Instructor Sex        Elliott (1950)               no significant effect
                      Lovell & Haner (1955)        no significant effect
                      Aleamoni & Yimer (1973)      no significant effect
                      Elsmore & Lapointe (1974)    no significant effect

Research              Voeks (1962)                 no relationship between
Productivity                                       research productivity and
                                                   student ratings
                      Bresler (1968)               Faculty who were more
                                                   successful in receiving outside
                                                   research funding received
                                                   more favourable ratings
                      McDaniel & Feldhusen (1970)  mixed findings (see text)
                      Hayes (1971)                 no relationship between
                                                   research productivity and
                                                   student ratings

The position that teaching and research are complementary
activities is supported in part by the work of Bresler (1968) and McDaniel
and Feldhusen (1970). Bresler found that faculty members who receive
more outside funding for research purposes also receive more favourable
student ratings. Unfortunately, Bresler's findings were not accompanied
by tests of statistical significance and varied greatly from one academic
discipline to another. Bresler's research, reported in Science, was
severely criticized on statistical and methodological grounds by Quereshi
(1968) whose rebuttal was also published in Science.

McDaniel and Feldhusen studied the relationship of scholarly activity
(as measured by: 1) number of 1st and 2nd authorships of books; 2) number
of 1st and 2nd authorships of journal articles; and 3) grant status) to
student ratings. Correlations were generally low, but significant and
positive relationships were found between second authorship of articles
and ratings. However, negative relationships were found between first
authorship of articles or books and ratings. Furthermore, no differences
were found in student ratings between faculty members who held a research
grant and those who did not.

The work of Voeks (1962) and Hayes (1971) supports the contention that
teaching effectiveness and research productivity are not related. In the
Hayes study, research productivity was measured in three ways:

1) publication rate (weighted by type of publication),
2) grant status, and
3) rating by department chairman.

Teaching effectiveness was measured by:

1) average student ratings over 4 semesters, and
2) department chairman's rating of ability.

Only one of the six possible correlations between research and teaching
measures was found to be significant - that between chairman's research
rating and chairman's teaching rating.
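
The count of six follows from pairing each of the three research
measures with each of the two teaching measures. A short Python
enumeration makes this explicit; the labels are paraphrases of the
measures named above and the variable names are illustrative only.

```python
from itertools import product

# Three research measures and two teaching measures, as described
# for the Hayes study (labels paraphrased for illustration).
research_measures = [
    "publication rate",
    "grant status",
    "chairman's research rating",
]
teaching_measures = [
    "average student ratings",
    "chairman's teaching rating",
]

# Every research measure paired with every teaching measure.
pairs = list(product(research_measures, teaching_measures))
for research, teaching in pairs:
    print(f"{research}  x  {teaching}")
print(len(pairs), "possible correlations")
```

It is worth noting that the one pair reported as significant involves
two ratings made by the same person, the department chairman.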

In summary, it appears that if teaching effectiveness and research
productivity are related, the relationship is at best a weak one. The
strongest evidence seems to support the contention that the two activities
are in fact unrelated.

2.27 Effect of Class Variables on Ratings

The effects of two class variables on student ratings have been
studied extensively (Table 2.10). These variables are class size and
course status (i.e., whether the course is required or elective for the
majority of the students enrolled).

■ - 



3r 



Table 2.10 EFFECT OF CLASS VARIABLES ON RATINGS

Class Variable           Study                         Effect

Class size               Gage (1961)                   curvilinear relationship; both large and
                                                       small classes were rated more favourably
                                                       than moderate-size classes

                         McDaniel & Feldhusen (1971)   small classes were rated most favourably

                         Miller (1972)                 small classes were rated most favourably

                         Wood et al. (1974)            curvilinear relationship; both large and
                                                       small classes were rated more favourably
                                                       than moderate-size classes

                         Aleamoni & Graham (1974)      no significant effect

                         Crittenden et al. (1975)      small classes were rated most favourably

Course status            Lovell & Haner (1955)         elective courses were rated more
(compulsory/elective)                                  favourably than required courses

                         Cohen & Humphreys (1960)      elective courses were rated more
                                                       favourably than required courses

                         [study name illegible]        elective courses were rated more
                                                       favourably than required courses

                         Miller (1972)                 no significant effect




It is widely believed that student ratings of courses and instructors
are inversely related to class size. In fact, the strongest evidence
tends to support this view. However, a number of studies have reported a
curvilinear relationship in which small and large classes receive equally
favourable evaluations and moderate-size classes receive significantly
lower ratings (Gage, 1961; Wood et al., 1974). Other studies have
shown no class-size effect (Aleamoni and Graham, 1974).

In the introduction to their paper, Crittenden et al. (1975) discuss
possible reasons for these inconsistent findings. Four explanations are
given.

First, in many studies the sample size (number of classes) is
relatively small, casting some doubt on the reliability of the results.
Second, there is no agreement among studies regarding the operational
definition of size categories. For example, the definition of "large"
has varied from "10 or more" to "200 or more." Third, it may be that
some students alter their expectations of instructional performance
to take into account factors such as class size. Finally, some
institutions or departments may attempt to counteract the presumed class-size
effect by assigning their best instructors and allocating more resources
to larger classes.



Crittenden and his associates go on to report the results of a
well-designed study consisting of 981 classes at the University of
Illinois at Chicago Circle. The same evaluation instrument was
administered in all classes and 8 size categories were used without
assigning labels to them. Class size ranged from under 20 to over
600. The findings show a clear linear relationship in which mean
student ratings decrease with increasing class size.

The results of studies on the relationship of course status
(compulsory/elective) to student ratings are fairly consistent. Although
occasional studies report no significant effects (Miller, 1972), the
weight of evidence supports the view that elective courses tend to
receive more favourable ratings than required or compulsory courses.
To our knowledge no study has shown that students consistently favour
required courses over elective courses.



III. RELIABILITY AND VALIDITY OF THE SOST

This section presents the methods and results of the reliability
and validity studies of the SOST.

3.1 INTERNAL CONSISTENCY OF THE SOST

Ss. The internal consistency analyses were based on 2229 student
responses without regard to class or instructor. (Students were enrolled
in 93 class sections taught by 53 different instructors.) Two-thirds
(67.3%) of the respondents were first or second year students and the
majority (64.9%) were Social Science or Science and Mathematics majors.
See Table 1.3 for a further description of the students and instructors.

Analytic Methods. Cronbach's alpha coefficient was calculated
for each of the subscales ("Sections") using SPSS subprogram
"Reliability." Prior to the analyses several of the item scales were reversed
(items 11, 16, 19, 24, 27 and 28) to ensure uniform directionality.
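
The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS "Reliability" subprogram used in the study: the response data are hypothetical, the 1 to 5 coding (A=1 ... E=5) is assumed for the reversal, and the function names are our own.

```python
from statistics import pvariance

# Items the report lists as reversed before analysis.
REVERSED_ITEMS = {11, 16, 19, 24, 27, 28}


def reverse_score(value, low=1, high=5):
    """Flip a response on a low..high scale (assumed coding A=1 ... E=5)."""
    return low + high - value


def cronbach_alpha(rows):
    """rows: one list of item scores per student, same item order in every row."""
    k = len(rows[0])
    item_variances = [pvariance([row[j] for row in rows]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows, j):
    """Alpha recomputed with the item in column j left out."""
    return cronbach_alpha([row[:j] + row[j + 1:] for row in rows])


# Hypothetical raw answers of one student to Items 10 and 11.
raw = {10: 2, 11: 4}
scored = {item: reverse_score(v) if item in REVERSED_ITEMS else v
          for item, v in raw.items()}

# Hypothetical responses of five students to a three-item subscale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
print([round(alpha_if_item_deleted(responses, j), 2) for j in range(3)])
```

The second function corresponds to the "Alpha if Item Deleted" column of Table 3.1: a value higher than the subscale alpha suggests that the item fits poorly with the remaining items.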

Results. The alpha coefficients are reported in Table 3.1. The
alphas range from .19 to .80. Internal consistencies of Sections A,
B and C are moderate to relatively high and are well within the ranges
reported for other teaching evaluation instruments (Table 2.2). However,
the alpha coefficients for Section D (.37) and Section E (.19) are
unacceptably low.

An examination of Section D ("Feedback") by analysis of variance
procedures shows that no single item is largely responsible for the
low reliability. However, the deletion of Item 25 ("Instructor's
expectations for student performance") would raise the alpha
coefficient to .45. Two possible explanations for this come to mind.
First, the item itself seems to have little relationship to "Feedback,"
as do the other items, to some extent. Second, the Likert scale
descriptors ("very low, low, average ....") are different from the
descriptors of the remaining 3 items ("strongly agree, agree, not sure...").

As with Section D, the low internal consistency in Section E
("Standards") is not attributable to any single item. Items 26 and 27 seem
to cover course workload whereas Item 28 is an evaluation of the course
assignments. Deletion of Item 28 would raise the alpha coefficient to
.28 (Table 3.1).

Table 3.1 INTERNAL CONSISTENCY OF THE SOST (1, 2)

Subscale      Cronbach's Alpha     Item     Alpha if Item Deleted

Section A          .78               9      [illegible]
                                    10      .70
                                    11      .75
                                    12      .73
                                    13      [illegible]
                                    14      .79

Section B          .65              15      .50
                                    16      .72
                                    17      .41

Section C          .80              18      .73
                                    19      .76
                                    20      .74
                                    21      [illegible]

Section D          .37              22      .24
                                    23      .08
                                    24      .35
                                    25      .45

Section E          .19              26      .14
                                    27      -.09
                                    28      .28

1. Calculation of alphas based on 2229 student responses without regard to class or
instructor.

2. Scalings for the following items were reversed prior to data analysis: 11, 16,
19, 24, 27, and 28.



3.2 STABILITY OF THE SOST

Ss. The stability analyses were based on the responses of 435
students who were enrolled in 25 sections of an Introductory Psychology
course (Psychology 115a). Although descriptive data for these subjects
were not analyzed, the students typically represent a wide spectrum
of interests, motivations and university majors.

The format of the course requires students to attend 2 weekly
meetings led by a graduate teaching assistant and 1 weekly presentation
by a guest lecturer. Student enrollment in sections ranged from
18 to 88, with an average enrollment of approximately 36 (35.8). The
grading procedure is based on 4 objective mid-term examinations (70%)
which are the same for all students enrolled in the course and a
series of small projects (30%) assigned by individual section leaders.

Experimental Design and Analytic Methods

Stability by the test-retest method was examined for intervals
of 7 days, 14 days, 21 days, and 28 days. The following data collection
procedures were employed.

All students evaluated their section leader by completing the SOST
on November 11 or 12 (depending on meeting day). A second evaluation
was completed according to the following schedule:
Sections 1-6 on November 18 or 19
Sections 7-12 on November 25 or 26
Sections 13-18 on December 2 or 3
Sections 19-25 on December 9 or 10

Each student was assigned an anonymous code number. The code
numbers, which were used in lieu of names or student I.D.s, permitted
the matching of first and second evaluations by student. Matched pairs
were obtained for 435 students.

Pearson product-moment correlation coefficients were calculated
using SPSS subprogram "Pearson Corr." Within interval groups, all
data were pooled and correlations were calculated without regard to
section or instructor. Stabilities were examined for individual
items only. The Ns associated with the 7 day, 14 day, 21 day and 28 day
intervals were 129, 108, 87, and 111, respectively.
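
The matching and correlation steps can be illustrated with a short Python sketch. It is not the SPSS "Pearson Corr" subprogram; the code numbers and ratings below are hypothetical, and the helper names are our own.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def match_pairs(first, second):
    """Keep only students (by anonymous code number) present in both evaluations."""
    codes = sorted(set(first) & set(second))
    return [first[c] for c in codes], [second[c] for c in codes]


# Hypothetical ratings on one item, keyed by anonymous code number.
first = {"A01": 1, "A02": 2, "A03": 4, "A04": 5, "A05": 3}
second = {"A01": 2, "A02": 2, "A03": 5, "A04": 4, "A06": 1}

x, y = match_pairs(first, second)
r = pearson_r(x, y)
print(len(x), round(r, 2))
```

Students who completed only one of the two evaluations (codes A05 and A06 in this example) drop out of the matched sample, which is why the N for each interval is smaller than section enrollment.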



Results. Stability coefficients are reported in Table 3.2.
Coefficients ranged from .40 to .77 (7 day interval), .22 to .73
(14 day interval), .12 to .76 (21 day interval), and .19 to .68
(28 day interval). These coefficients are moderate to low but, with
a few exceptions (asterisks), generally within the range reported for
other instruments (Table 2.3).
It is generally not acceptable to compare unadjusted correlation
coefficients derived from sources with varying sample sizes since the
significance level of r depends on N. The calculation of mean stability
coefficients across several items is also considered, by some, to be a
questionable practice. It is, however, interesting to note that even
with decreasing Ns the mean stability coefficients decrease as the
time interval increases (compare the 7 day mean to the 21 day mean). This
decrease is probably due to both error variance associated with the
students as well as true changes in the students' perceptions of their
instructors and the course.
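
The dependence of r on N can be made concrete with Fisher's z transformation, a standard technique that was not part of the reported analysis. The sketch below computes an approximate 95% confidence interval for the same coefficient at two sample sizes, using the Item 9 values in Table 3.2 (r = .50 at 7 days with N = 129, and r = .50 at 28 days with N = 111). The function name is our own.

```python
from math import atanh, sqrt, tanh


def fisher_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for r via Fisher's z (default 95%)."""
    z = atanh(r)
    se = 1 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)


low_7, high_7 = fisher_ci(0.50, 129)    # 7 day interval
low_28, high_28 = fisher_ci(0.50, 111)  # 28 day interval

print(round(low_7, 2), round(high_7, 2))
print(round(low_28, 2), round(high_28, 2))
```

The same coefficient carries a wider interval in the smaller group, which is one way of seeing why coefficients should not be compared across columns without adjustment.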

3.3 CRITERION-RELATED VALIDITY: THE RELATIONSHIPS BETWEEN SOST RATINGS AND STUDENT ACHIEVEMENT

Ss. These analyses were based on the responses of 620 students
enrolled in 25 sections of an introductory Psychology course (Psychology
115a). The sample population was characteristically somewhat
different from the total data pool (Table 3.3). The sample was
composed of a relatively larger proportion of Arts and Social Science
majors (61.5%) and a smaller proportion of Science and Mathematics
majors (16.9%). Over 80% of the subjects were first year students.
Although the sex of the subjects was not ascertained, enrollment
figures for the course generally show an equal mix of males and females.

Analytic Methods. All analyses were based on section means. The
mean ratings for each section were calculated using SAS procedure
"Means." In addition, the mean total achievement scores for each
section were calculated. The total achievement scores were expressed
as percentages and represented the weighted performance on 4
multiple-choice examinations (70%) and several "subjective" projects assigned
by individual section leaders (30%). Mean total achievement scores
ranged from 69.4 to 79.7 for the 25 sections.



Table 3.2 STABILITY OF THE SOST (1, 2)

                        Stability Coefficients

Item            7 days      14 days     21 days     28 days
                (N=129)     (N=108)     (N=87)      (N=111)

 9               .50         .63         .60         .50
10               .62         .65         .56         .61
11               .40         .51         .46         .33
12               .71         .67         .76         .39
13               .50         .56         .36         .40
14               .77         .39         .67         .62
15               .65         .59         .24**       .36
16               .49         .44         .48         .23***
17               .63         .63         .42         .27***
18               .73         .73         .61         .50
19               .63         .54         .46         .35
20               .63         .62         .46         .55
21               .49         .63         .58         .59
22               .43         .48         .12*        .19**
23               .65         .50         .55         .26***
24               .46         .22***      .46         .44
25               .46         .56         .49         .47
26               .63         .56         .62         .68
27           [illegible]     .31         .54         .47
28               .54         .43         .28***      .35

Mean (all items) .58         .53         .49         .43

1. Because of differences in N among groups, rs should not be compared across
columns.

2. All rs are significant at p < .001 with exception of asterisks (***p < .01,
**p < .05, *p > .05).




Table 3.3 PROFILE OF INTRODUCTORY PSYCHOLOGY STUDENTS

(N=620)

Item                                                  Frequencies
                                    A            B            C             D            E

1. My major is in:                  Arts         Soc. Sci.    Sci. & Math   Business     Other
                                    21.0%        40.5%        16.9%         6.7%         [illegible]

2. This course is part of my:       Hon. Prgm    Gen. Prgm
                                    41.9%        58.1%

3. I have completed the             0-2          3-7          8-12          13-17        18-
   following number of              82.4%        13.1%        2.8%          1.6%         0.2%
   University level full
   courses:

4. Rating myself against the        Superior     Above Avg.   Average       Below Avg.   Failing
   performance of other             3.7%         [illegible]  51.9%         7.1%         0.6%
   students in the class, I
   see myself in one of the
   following groups:

5. This course was compulsory.      Yes          No           Not Sure
                                    40.7%        54.5%        4.9%

6. My attendance and punctuality    Yes          No
   have been consistently good.     [illegible]  [illegible]

7. Compared to other courses        Excellent    Above Avg.   Average       Below Avg.   Poor
   I have taken, I consider         8.5%         40.3%        42.3%         7.6%         1.3%
   my effort in this course
   to have been:

8. I have found the material        Yes          No
   in this course to be             24.2%        75.8%
   inherently difficult.



Pearson product-moment correlations were calculated between
mean SOST ratings and mean total achievement scores across the 25
sections using SPSS subprogram "Pearson Corr."
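
A minimal Python sketch of this section-level analysis follows. It does not reproduce the SAS "Means" or SPSS "Pearson Corr" procedures; the three sections, their ratings and their achievement scores are hypothetical, and ratings are assumed to be coded A=1 through E=5, so that a low rating means agreement.

```python
from collections import defaultdict
from math import sqrt


def section_means(records):
    """records: iterable of (section, value) pairs; returns {section: mean}."""
    totals = defaultdict(lambda: [0.0, 0])
    for section, value in records:
        totals[section][0] += value
        totals[section][1] += 1
    return {section: total / count for section, (total, count) in totals.items()}


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical student-level data: (section, rating) and (section, achievement %).
ratings = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 4), (3, 4)]
achievement = [(1, 80), (1, 78), (2, 75), (2, 73), (3, 70), (3, 70)]

mean_rating = section_means(ratings)
mean_score = section_means(achievement)
sections = sorted(mean_rating)
r = pearson_r([mean_rating[s] for s in sections],
              [mean_score[s] for s in sections])
print(round(r, 2))
```

In this made-up example the coefficient is negative because sections that agreed more strongly with a positively worded item (lower codes) also achieved higher scores, which is the same sign convention discussed for Table 3.4.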

Results. The correlation coefficients are reported in Table
3.4. Eleven (11) of the 20 coefficients are significant at the .05
level or better. This number of significant correlations is
considerably greater than would be expected by chance alone.
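
The chance expectation can be approximated with a binomial calculation. The sketch below assumes, for illustration only, that the 20 tests are independent; the SOST items are intercorrelated, so the figure is a rough guide rather than an exact probability.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_tests = 20
alpha_level = 0.05

expected_by_chance = n_tests * alpha_level
p_eleven_or_more = prob_at_least(11, n_tests, alpha_level)

print(expected_by_chance)
print(p_eleven_or_more)
```

Under these assumptions only about one significant coefficient in 20 would be expected by chance, and eleven or more would be extremely unlikely.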

Of the 11 significant correlations, 10 are found among items
in Sections A, B and C of the SOST. This would argue that the
instructor's ability to communicate with and motivate students is
more important in promoting learning than the assignments, workload
or evaluation system employed in the course. It further argues that
good teachers (those who promote learning in their students) receive
good evaluations and that poor teachers receive poor evaluations.

An examination of individual coefficients shows that, although
many are statistically significant, the absolute values are moderate
to low. These findings are consistent with much previous work on
the subject (Table 2.6).

It will be noted that many of the significant correlations in
Table 3.4 are negative. However, an inspection of the item scales
shows that, where negative correlations are indicated, a low scale
score (A or B) implies agreement with a generally positive statement.
Furthermore, where significant correlations are positive, a high scale
score (D or E) implies disagreement with a generally negative
statement. In sum, regardless of the direction of the correlation
coefficient (+ or -), all significant coefficients imply a positive
relationship between teaching effectiveness or course structure and
student achievement.

The largest correlation coefficients are associated with Items
21 ("The instructor was successful in making difficult material
understandable."), 18 ("The instructor made this course as interesting
as the subject matter would allow."), 10 ("The instructor presented
material in a coherent manner, emphasizing major points and making
relationships clear.") and 9 ("The instructor is clear and audible.").
All of these items seem to be related to the instructor's general
ability to communicate.




Table 3.4 RELATIONSHIPS BETWEEN SOST RATINGS AND STUDENT ACHIEVEMENT (1)

Item        Correlation Between Item and
            Total Achievement Score (r) (2)

 9              -.55**
10              -.56**
11               .37*
12              -.29
13              -.42*
14               .02
15              -.43*
16               .02
17              -.42*
18              -.57***
19               .38*
20              -.43*
21              -.58***
22              -.31
23               .07
24               .36*
25               .10
26               .15
27               .10
28              -.09

***p < .001, **p < .01, *p < .05

1. In some cases a negative correlation implies a positive relationship
because of the direction of the item scale (see Results, section 3.3).

2. Item responses were coded as follows: A=1, B=2, C=3, D=4 and E=5.



3.4 FACTOR ANALYSIS OF THE SOST 

The factor analysis was based on the responses of 2229 students.
For a description of these subjects see Table 1.3.

Analytic Methods. The factor analysis was performed by SPSS
subprogram "Factor" with the PA2 factoring method using varimax rotation.
This procedure calculates a principal-component solution with iteration
and employs orthogonal rotation with Kaiser normalization. The factoring
method replaces the main diagonal elements of the correlation matrix
with communality estimates and employs an iteration procedure for improving
the estimates of communality.

The eigenvalue criterion for establishing the number of components
(factors) was 0. To simplify interpretation and minimize the number
of cross-loadings, only loadings of .40 or greater were interpreted.

Results. Five factors emerged from the analysis (Table 3.5). These
factors accounted for 55.2% of the variance in the data. The first
factor by itself accounted for 30.0% of the total variance.

The communalities (total variance of an item accounted for by the
combination of all common factors) ranged from .08 (Item 27) to .68
(Item 10). The average was approximately .40 (.402).
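
For an orthogonal solution, an item's communality is the sum of its squared loadings across the factors. The Python sketch below shows that relationship on a hypothetical row of loadings and then recomputes the average from the communalities listed in Table 3.5; the function name is our own.

```python
def communality(loadings):
    """Sum of squared loadings of one item across orthogonal factors."""
    return sum(value ** 2 for value in loadings)


# Hypothetical loadings of a single item on five factors.
example_h2 = communality([0.6, 0.2, 0.0, 0.2, -0.1])

# Communalities for Items 9-28 as listed in Table 3.5.
communalities = {
    9: .44, 10: .68, 11: .38, 12: .45, 13: .38, 14: .30, 15: .41,
    16: .20, 17: .48, 18: .57, 19: .48, 20: .51, 21: .57, 22: .39,
    23: .30, 24: .51, 25: .24, 26: .31, 27: .08, 28: .36,
}

mean_h2 = sum(communalities.values()) / len(communalities)
lowest_item = min(communalities, key=communalities.get)
highest_item = max(communalities, key=communalities.get)

print(round(mean_h2, 3), lowest_item, highest_item)
```

The recomputed mean agrees with the reported average of .402, and the extreme items match those named in the text.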

The factorial complexity of the rotated matrix was relatively high.
A number of items loaded significantly on at least two factors. An
interpretation of the factor structure is complicated by these
cross-loadings (Table 3.6).

Factor I: Instructional Skill

The first factor is a measure of the instructor's general ability
to communicate with and motivate students. Items with the highest
loadings (10, 18, 21 and 9) assess the instructor's coherence and
clarity of presentation and his success in making the subject matter
interesting and understandable.

Factor II: Interaction

The second factor seems to relate to student-teacher rapport and
the general level of verbal and written exchanges between the instructor
and the student.

Table 3.5 FACTOR ANALYSIS OF SOST
(N=2229)



Item 



Communal ity 



Factor I 



Varimax Rotated Factor Matrix 

_i 



Factoc II 



Factor III 



Factor IV 



factor" V 



9 

10 
11 
12 
13 
14 
15 

16 , 

17 

18 

19 

20 

^1 

22 

23 

24 

25 

26 ; 

27^ 
28' 



.44 

^68 

.38 

.45 

.38. 

.30 

.41' 

.20 

.48 

.57 

.48 

.51 

.57 

.39' 

.30 

.51 

.24. 

.31 

.08 

.36 



.60 . 
.77' 
-.52 
.48 
.35. 
.17 ' 
.32 
-.12 
.29 
.63 
■.47 
.43 
.61 
.10 
.15 
.14- 
.05 
.06 
.04 
.21 



-18 
.18 
-.1-2 
.07 

■* ••"'2 
.39 

-.31 

• .50 
.37 

-.37-^ 
.55 
.42 
.60 
.38 

-.15 

.04 

. .05 
.56 



.00 
.03 
-.05 

* .06 
.07 
-.03^ 
-.12' 
•13 
-.07 
-.03 
.26 

'-.12 
.03 
.12 
.23 
■ U 
.49 
.55 

-*11 
.06 



.22 
-.19 
.46 
.40 
.50 
.37 
-.25 
.38 
."16 

.10 

■ .11 

•11 

,13 

-•12, 
-.01 

-.03 

I 

.0? 
..05 



.7 



-.09 
-.09 

-.05- 
-.21 
.06 
-.01 

'.n 

.03' 
-.03 

g^-21 

.05 
-.04 . 
-.01 
-.24 

.66- 

.01 
-.06 

.25 
-.03 



.A 






Table 3.6 FACTOR STRUCTURE OF SOST

Factor                   Loadings   Items

I   Instructional          .60       9. The Instructor is clear and audible.
    skill                  .77      10. The Instructor presented material in a coherent
                                        manner ...
                          -.52      11. Course material was disorganized and hindered
                                        understanding.
                           .48      12. The Instructor was consistently prepared for
                                        class.
                           .63      18. The Instructor made this course as interesting
                                        as the subject matter would allow.
                          -.47      19. The Instructor did not increase my interest in
                                        the subject matter.
                           .43      20. The Instructor motivated me to put forth a
                                        good effort.
                           .61      21. The Instructor was successful in making
                                        difficult material understandable.

II  Interaction            .50      17. The Instructor maintained a generally helpful
                                        attitude toward students.
                           .55      20. The Instructor motivated me to put forth a
                                        good effort.
                           .42      21. The Instructor was successful in making
                                        difficult material understandable.
                           .60      22. Verbal or written comments on assignments have
                                        been constructive.
                           .56      28. The assignments provided a valuable learning
                                        experience.

III Workload               .49      25. The Instructor's expectations for student
                                        performance were ...
                           .55      26. The amount of work required for this course has
                                        been ...

IV  Organization           .46      12. The Instructor was consistently prepared for
                                        class.
                           .40      13. The Instructor was clear on what was expected.
                           .50      14. The Instructor's attendance and punctuality have
                                        been consistently good.

V   Feedback               .66      24. Throughout this course, I have not been able to
                                        assess my progress and achievement.




Factor III: Workload

The third factor is a measure of the amount of student effort
required by the instructor and the course. The two items with high
loadings on this factor (25 and 26) assess the amount of work required
for the course and the instructor's expectations for student performance.

Factor IV: Organization

The fourth factor is an assessment of the organizational skills
of the instructor. Items loading highly on this factor (12, 13, and 14)
pertain to the instructor's attendance and punctuality, his preparedness,
and the clarity with which he has stated his objectives and requirements.

Factor V: Feedback

The fifth factor (Item 24) is a measure of the extent to which the
student is able to judge his level of performance in the course.



3.5 THE EFFECT OF STUDENT VARIABLES ON SOST RATINGS

Ss. These analyses were based on the entire data pool consisting of
2229 student responses regardless of class section or instructor (except
where noted). For a description of these subjects see Table 1.3.

Analytic Methods. Analysis of variance procedures were used to
examine the effects of the following variables on student ratings:
(1) the student's major (faculty affiliation - Item 1); (2) the student's
level (number of courses completed - Item 3); (3) the student's
perception of his own performance relative to other students in the class
(Item 4); (4) whether the course was compulsory or elective (Item 5);
(5) the student's perception of his effort in the course relative to
other courses he has taken (Item 7).

A series of one-way multivariate and univariate analyses of variance
were performed using the "General Linear Models" (GLM) procedure of the
SAS package. A fixed-effects model (I) was used. Only the univariate
F-ratios and their associated significance levels are reported. Post
hoc analyses were not performed; however, means and standard deviations
were calculated on anova levels using SPSS subprogram "Breakdown."
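
The univariate F-ratio for a one-way fixed-effects design can be sketched in Python as follows. This is a plain textbook computation, not the SAS GLM procedure, and the three groups of ratings are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way fixed-effects ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical ratings on one item from students in three majors.
ratings_by_major = [
    [1, 2, 3],
    [2, 3, 4],
    [4, 5, 6],
]
f_ratio, df_b, df_w = one_way_anova_f(ratings_by_major)
print(round(f_ratio, 2), df_b, df_w)
```

Each F-ratio in Tables 3.7 to 3.9 is the result of one such analysis, with the levels of the student variable as groups and a single SOST item as the dependent measure.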

3.51 The Effect of Student's Major on SOST Ratings

All analyses of variance showed significant differences in SOST
ratings by student major (Table 3.7). Business majors rated their
instructors and courses most favourably on 12 of the 20 items. Arts
majors rated most favourably on 6 of the items and Arts/Business
majors rated identically on 2 items.

Least favourable ratings were given on 12 items by students who
identified their major as "other." Science and Mathematics majors
rated their instructors and courses least favourably on 8 items.
Although no simple and totally consistent pattern emerged from
these analyses, it appears that Business majors are more lenient
(favourable) in their evaluations than non-Business students.
Furthermore, Science and Mathematics students and "others" tend to
be harshest (least favourable) in their ratings.

It should be pointed out that Business students constituted
the smallest group in the sample population (8.8%) and that
approximately 30% of these ratings were obtained in only 2 class sections.
In addition, the Science and Mathematics students comprised the
largest group (40.4%) and evaluated the largest number of courses and
instructors. Although no tests for homogeneity of variances were
performed, heterogeneity might account for some of the observed
differences. This explanation is unlikely, however, since F is known
to be robust with respect to departures from homoscedasticity
(Winer, 1971).

3.52 The Effect of Student's Level on SOST Ratings

Thirteen (13) of the 20 SOST items showed significant differences
by student level (Table 3.8). As with student major, no clear and
consistent pattern is discernible on the basis of the number of courses
completed by the student, although a few generalizations can be made.
Upper-level students (those having completed at least 13 courses)
tended to rate their instructors more favourably than lower-level students
in terms of ability to communicate and motivate (Items 9, 18, and 20).
In addition, upper-level students were more inclined to rate the
instructor's expectations (Item 25) as high and the course workload
(Item 26) as relatively heavy. Finally, honours and graduate-level
students (18 or more courses) considered their instructor's punctuality
to have been better than other students (Item 14); however, they considered



(N»2229) 



Item 
9 

10 

n 

12 

13 

14 

>5 

16 

17 

18 
. 19 
20 
21 
22 
23 
24 
'■^5 
26 
27 
28 



1.76 (.80) 
2.10 (1.05) 
3.92 f.98) 
1.79 (.88) 
* 1.76^^ 
1.35 (.59) 
1.66 (.77)' 
3.76 (.93) 
1.86 (.84) 
2.?3'>1.11). / 
3.43 (1.15y 
2.65 (1.0(f) 
2.31 (1.00) 
2.49 (.94) 
2.22 (.98) - 
3.86 (.95) 
3.48 (.75)- 
3.42 (1.05) 
2.78 (1.29) 
- .2.26. (.95) ■ 



■/Social ' 
Science (B 

1.84 (.86) 
. *2.T9?(1.08) 
^ 3.82 (1.02) 
• 1.83 (.83) 
1.86 (.96) 
1.41 (.68f 
1.75 f.86). 
3^68 (.89) 
T;97 (.88) 
2.32 (1.10) 
3.2i (1.15) 
2. 78 .v( 1.03) 
.2.38 C^3) 
. 2.58 (.87) 
2.38 (1.02) 
3:82 '(.93) . 
3,48 (.67)" . 
jlv48 (1.02) 
2.7^ (1.21) 
i:43 (.99) 



Wafis and (Standard Deviations) bv Major 



Science 
and Math (c: 



. r.93 
2.34 
3.65 
• 1.91 
2.08 
1.53 
■ 1.93 
3. 60 
^ 2.02 
2.37 
3.24 
2.76 
2.53 
2.70 
2.6(X 
3.58 
^56, 
3.30 
2.93 
2.47 



(.9Q) ■ 
(1.06) 
(1,03) 
(.90) 
.(.99) 
(.67) 
(.94y 
(.96) 
(.87)' 
0-02"^ 
(1.12) 
(1.00) 
(.95^) 
(.94) 
(l'.'04-) 
(.98) 
(^75) 
(.86) 
(1.21). 
(.96) 



1.63 
1.89 
4.00 
1.62 
•1.74 
1.28 
1.72 
3.57 
^ J. 94 
'2.07 
3.34 
2.54 
2.31 
2.69 
2.37 
3.45. 
3.5a 
3.68 
2.51. 
2.26 



(.73) 

(.83y 

(.89) 

(.75) 

(.74) 

(,54) 

(.80) 

(.98) 

(.81) 

(.97) 

(T.IO) 

(.99) 

(.86) 

(.9.1) 

(.89) 

(1.04) 

(.86)* 

(1..02) 

(1.23) 

(-.86) 



2.24 (1.13) 
2.43 (1.09) 
.3.66 (.99). 

1 .87 (.76)' 
2.02 (.97)" 
"l.56 t.74) 
2.07 (.97) 

3.51 (.89) 
?,21 (.90) 
2.66 (1.16) 
2.98 (i.20)^ 
3.10- (1.04) 
2,66 (.1.13) 

.2.79-(.88> 
2.43 (.99) 

3.52 (1.02) 
3.40 (.87) 
3.63 (1.03) 
2.76 (1.2?! 
2.65 (1.05) 



17.08 
12.41 
7.20 * 

9.33 

n.73. 

11.37 

12.'gb 
3.70** 

6.53* 
10.19 

6.09 
10.53 

6.93 

4.31** 

7.40" 
11.17 

2.73* 

3.23*. 

4.52** 

5.94 



■ Business (S.Stl; Otber {13JI)- l'3.1«). Social Science (24.61); Science and Math (40.4J); 



All Fs are significant at p < .0001 with exception of asterisks (**p < .01, *p < .05).



Table 3.8 THE EFFECT OF STUDENT'S LEVEL ON SOST RATINGS

(N=2229)





Mearfl and (Standard Deviations) bv. Level 


■9 


[tern 


• 0-2 (A) 


3-7 (8). 


' 8-12 (C) 


. 13-17 (D) 


18- (E> 


F ra-tib^ 


9 


UBs' (.'84) 


1.99 (.96) » 


2.13 (1.06) 


1 84 (^ Q?) 


1 79 ^ ^\ 


Q no 




2.26 (1.04) 


2.22 (1.05) 


2.35 (1.06) 


2 13 (1 UV 






11 


. 3.80 (.95) 


3.70 (-1.05) 


*3.SB (1.04) 


3 77 (1 11) 


J ^py 1 1 . 1 u J . 


1 .90(NSJ 


12 

ft ; 


.1.91 (.85) 


1.83 (.80) ' 


1.81 (.80) 


V fiS f fil ) 


M Qi ^ Q^;^ 




1.90 (.93)' 


" 1.93 (.91) 


1 99 'I 93) 




c.dU t 1 . 1 / J 


0.06 


14 


1.46 (.67) ' 


1.47 (.67) 


1.53 (.74) 


1 48 ( 641 ^ 




Z. 52* 


15 


1.85 (.91) 


1.82 (.88) 


. 1.98 (.96) 

• . ^w y * ^ ^ / 


1 79 ( fi7) 


I .OU \ .0/ J 


1 *90vN5/ 


16 


3.57 (.87) 


3.66 (.92) 


3.73 ( 96) 

W.#N^ \ m f 




, , J . tXD U • 1 1 ) 


• D.94 • 


17 • 


.2.03 (.85r 


2.00 (.88) 


2.09 ( 91) 


1 QO t ftp) * 


V 

1 .94 .(,89) . 


0 AC / UC \ 

. Z.05(NS) 


18 


2.40 (1.06) 


2.41 {1.09) 


2.41 (1 13) 




' 9 9^ Ai m \ • 
f,£<: \l,.0/J 


* 5.97 


19. 


3.20 (1.T2) 


3.18- (1.1-9) 


3.26 (1 18) 


3 46 M 


9Q /I 1C\ 

, o.a ^ 1 . 1 3 J 


X U79(NS) 


2Q, 


2.88" (!99) 


2.80 (1 .07) 


2.73 (1 OV) 






1 J . 44 


21 


,2.47 ('?96i 


2.46 (.*93) 


2.55 (1.02) 


2.32 (1.05) 


?.44 (1.02). 


1 . 43 ( NS ) 


22 ; 


2.69 (.86) 


. 2.64 (.93) 


2.59 (..94) 


/2.51 (1.00) 


^ a.65 (1,03) 




23 


2^47 (1:02) • 


2.40 (.99) , 


2.40 t.97) ' 


2.26 (l.bo) 


2.64 (1.06) 


3-. 80*^, 




IW (.96) 


"3.63 (1.02) 


3.58 (.96) 


3.63 (KOb) 


3.50 (1.05) 


5.70 


25 


3.42 '{^S^)m 


3.47 (.77) 


3.38 (.82) 


3.70 (.79) ' ; 


3.88 (:83) 


17.2?* ' 


26 


3.54 (.95) 


3.48 (.98) . 


3.26- (.9ff) 


. 3.55-(.^) 


3;'6.6 (.97) 


■ ;5,;z 


27 . 


2.69 (i.22) 


2.93 .(1 .26) 


- 3.02 (1.18) 


2.95(1.22) . 


2.86' i 1.23) . 


.,4.92***". 


28 


2.45 (.93) 


2.45 (1.01) 


2.50 (1.02) 

* 


2.38 (1.15r, 


2.4t) J...97)-' 


0.76(NS) 



The number of missing cases ranged from 1.2% to 5.2%. Based on 2202 responses,
the breakdown of responses by levels was: 0-2 (52.0%); 3-7
(15.3%); 8-12 (10.5%); 13-17 (8.4%); 18- (13.7%).

Fs are significant at p < .001 with exceptions of (NS) and asterisks (***p < .001,
**p < .01, *p < .05).



the evaluation system less fairly applied (Item 23), their ability to
assess their own progress and achievement less marked (Item 24), and
the instructor's expectations less clearly delineated (Item 13).

In general these findings are supportive of previous work which
has shown that upper-level students tend to evaluate instructors'
presentations more positively (Table 2.8).

3.53 The Effect of Student's Performance on SOST Ratings

Sixteen (16) of the 20 SOST items showed significant differences
when responses were classified by the student's perception of his own
performance relative to other students in the class (Table 3.9).
These findings are interesting on several accounts.

Quite aside from the question at hand, the percentages of students 

.who classify themsalves In each category Is at least of passing Interest. 
Over-88% of the students see themselves as "average" or "above average*." 
Almost 51 classify their performance as "superior" and only ]% see them- 
.selves failing. It might be interesting to compare Mes.e self-appraisals 
with grades actually received in courses. It would appear that. students' 
tend t6 cluster themselves. 1n the centre of a grade distribution, perhaps 

. to a greater extent $han their professors do. Our guess is that it is 
a rare profes^r (In these days of "grade Inflation") who assigns onl^ 
5 "As" and 1 "F" In a class -of 100, students.' 

The results of the analyses of variance are equally interesting.
A clear and fairly consistent pattern indicates that students who see
themselves as "below average" or "failing" tend to rate their instructor
and the course less favourably than other students. Equally consistent
findings show that students who perceive their performance as
"above average" or "superior" rate their instructors as more effective,
the feedback as "constructive," the evaluation system as "fairly applied,"
and the assignments as a "valuable learning experience." It appears
then that a direct (and perhaps linear) relationship exists between
students' perceptions of their own performance relative to others in
the class and their evaluation of the instructor and the course.

These findings are probably not surprising to many; however, it
is interesting that similar findings have not been widely reported (to
our knowledge). The findings, furthermore, tend to cast some doubt on






Table 3.9 EFFECT OF STUDENT'S PERFORMANCE ON SOST RATINGS

(N=2229)



1 



Means and (Standard Deviations) bv 



Superior (A) 



K89 
2.13 
3.84 
1 . 75: 
1.83 
1.33 
1.81 
3.73 
2.01 
2.^8 
3.44 
2.62 
2.27 
2.61 
2.30 
3.91 
,3.49 
3.30- 
2.94 
2.24 



(1.14) 

(1.12) 

(1.14) 

(.87) 

(.:92) 

(.€0) 

{.93) 

(.96) 

(.94), 

(1.14) 

(1.24) 

(1.05). 

(1.00) 

(.93) 

(.95) 

(1.04) 

(.84) 

(1.03) 

(1.28) 

(.86) 



Above
Average (B)



1.82 (.89) 
2.18, (1.07) 

3.83 (1.04) 

1.89 (.97) 
1.45 (.69) 
1.^0 (.87) 

3.66 (.96) 

1.93 (.83) 
2.27 (1.06) 
3.35 (1.11) 

2.67 (1.02) 
2.37 '(.98) 
2.58 (.90) 
2.38 (.95) 
3.85 (.91) 

. 3.^3 (.78) 

J.U (.97) 

2.94 (1.25) 
2.42 (.96) 



Average (C) 



1.94 
2.28 
3.73 
1.90 
1.99 
1.49 
1.88 

2.oV 
2.41 
3.17 
^2.82 
2.50 
2.69 
2.47 
3.55 
3.52 
3.57 
2.75 
2.45 



(.90) 
(1.03) 
(.96) 
(.83) 
(.95) 
(.67)' 
(.90) 
(.92) 
^.88) 
(1.07) 
(1.15) 
(1.00) 
(.94) 
(,91) 
(1.00) 
(.98) 
(.74) 
(.94) 
(1.19) 
(.97) 



Performance
Below 
Average (D) 



2.06 (.94) 

2.40 (1.18) 
3^57 (1.04) 
1.89 (.84) 
•2.05 (1.07)^ 
1.43 (.54) 
1.82 (.91) 
3.52 (.9-3) 
2.11 (.98) 
2.42^1.08) 
2.98 (1.23) 

S 3.04 (1.01) 
2.ai (1.04) ■ 
2.73 (.99) 
2.77' (T#9) : 
3.16 (i.iy 

^ 3.44 (.67) 
* 3'. 76* (.97) 

2.41 (1.18) 
2.56 (l.Vl) 



Failing (E)



2.05 (.95) 
2.55 (1.18) 
3.24 (1.26) 
1.77 (.61) 
2.23 (1.02) 
. 1.27 (.55) 

2.32 (1.21) 
3.55 .(.80) 
2.50 (.86) 
2.64 (1.26) 
2.64 (1.00) 

3.33 (1.15) 
-2,95 il."l3.), 

3.18 O.OS) 
3..23*(1.27) ' 
2. 95 -.(1.25)"* 
3.23 (^l.U) 
3.59' (1.33) 
_^55 (1.50)- 
2.90^1.14) 



F ratio



3. S3** 

a.«i*' 

1.1T(NS 
2.47* 
2>30(NS 
2.61* 
1.45(NS 
5.. 59*** 
3.16* 
5.73*** 
6.47^* 
To. 04***' 
4.66*** 
' 9.01***' 
20.48***^ 
1.22(NS 
6.71***' 
5.54*** 
2.40* 



The number of missing cases ranged from 0.7% to 4.8%. Based on 2213 responses (0.7%
missing), the breakdown of responses by performance was: Superior (4.9%), Above Avg.
(38.5%), Average (49.6%), Below Avg. (6.1%), Failing (1.0%).

****p<.0001, ***p<.001, **p<.01, *p<.05






the validity of individual student ratings, though not necessarily on
ratings received by an instructor from an entire class.

3.54 The Effect of Course Status (Compulsory/Elective) on SOST Ratings

Prior to analysis the responses of students who were "not sure" of
their course status were dropped from the data pool, resulting in 2139
useable responses (95.9% of original data pool). Out of convenience,
the analysis of variance procedure was used even though the "Student's
t-test" is more often employed in a 2-group design. t is nothing more
than a "step-down" of F (with two groups, F equals the square of t) and
both yield findings having identical "significance levels."

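The equivalence of t and F in a 2-group design can be checked numerically. The following is a minimal sketch, assuming hypothetical ratings (not data from this study) and helper names of our own choosing (`pooled_t`, `one_way_f`). It shows that for two groups the one-way F ratio equals the square of the pooled-variance Student's t.

```python
from math import isclose
from statistics import mean, variance


def pooled_t(a, b):
    """Student's t for two independent groups, using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def one_way_f(*groups):
    """One-way fixed-effects ANOVA: between-groups MS over within-groups MS."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical ratings on a 5-point item (illustrative only).
compulsory = [2, 3, 2, 1, 3, 2, 4, 2]
elective = [1, 2, 2, 1, 2, 1, 3]

t_value = pooled_t(compulsory, elective)
f_value = one_way_f(compulsory, elective)
print(round(t_value ** 2, 4), round(f_value, 4))  # the two values agree
```

Because F = t squared holds exactly when there are two groups, reporting F in place of t loses nothing; it only keeps the reporting format uniform with the multi-group analyses.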

Significant differences were found for 18 of the 20 SOST items when
responses were classified by course status (Table 3.10). In every case,
students rated "elective" courses more favourably than "compulsory"
courses. The two items for which no differences were found asked
students to assess the instructor's attendance and punctuality (Item 14)
and his expectations for student performance (Item 25).

The finding that "elective" courses are more attractive to students
than "compulsory" courses is not particularly surprising. It does,
however, again question the validity of individual student ratings and,
in some cases, even class ratings. These results are generally consistent
with previous research (Table ?.10).

3.55 The Effect of Student's Effort on SOST Ratings

Significant differences were found among 16 of the 20 SOST items
when responses were classified by the student's perception of his own
effort relative to his effort in other courses (Table 3.11). Where
differences were found, a consistent pattern of ratings by effort
emerged.

Students who reported their effort as "excellent" or "above average"
consistently rated the instructor and the course more favourably than
other students. Moreover, those who indicated that their effort was
"below average" or "poor" gave the least favourable evaluations.

One might profitably speculate about the relationship of student
effort to teaching evaluations. It might be that students who "try
harder" are more likely to succeed and thereby see the instructor and






Table 3.10 EFFECT OF COURSE STATUS (COMPULSORY/ELECTIVE) ON SOST RATINGS

(N=2139)

Means and (Standard Deviations) by Course Status

Item   Compulsory (A)   Elective (B)    F ratio

 9     2.00 (1.00)      1.77 (.78)      33.73
10     2.33 (1.09)      2.16 (1.01)     13.82**
11     3.64 (1.05)      3.89 (.96)      31.37
12     1.92 (.86)       1.81 (.85)      7.99**
13     2.0? (1.00)      1.83 (.91)      31.85*
14     1.?7 (.65)       1.44 (.67)      0.71 (NS)
15     [?]              1.73 (.79)      24.79
16     3.55 (.95)       3.72 (.90)      14.65
17     2.09 (.92)       1.92 (.79)      19.75
18     2.44 (1.10)      2.24 (1.04)     16.20
19     3.09 (1.15)      3.39 (1.12)     40.23
20     2.84 (1.04)      2.71 (.99)      6.77**
21     2.57 (1.00)      2.35 (.93)      2?.85
22     2.77 (.93)       2.53 (.89)      25.38
23     2.62 (1.03)      2.27 (.97)      51.14
24     3.54 (1.02)      3.79 (.92)      32.98
25     3.53 (.79)       3.50 (.72)      0.00 (NS)
26     3.66 (.92)       3.37 (1.00)     39.60
27     2.75 (1.22)      2.86 (1.24)     4.00*
28     2.53 (.9?)       2.34 (.94)      17.58

Responses of students who were "not sure" of course status were dropped from
this analysis. Of the remaining 2139 respondents 53.6% evaluated a
"compulsory" course, and 45.8% evaluated an "elective". Missing cases
constituted the remaining 0.6%.

All Fs significant at p<.0001 except asterisks (***p<.001, **p<.01,
*p<.05).






Table 3.11 THE EFFECT OF STUDENT'S EFFORT ON SOST RATINGS

(N=2229)

Means and (Standard Deviations) by Effort

Item  Excellent (A)  Above Avg. (B)  Average (C)   Below Avg. (D)  Poor (E)      F ratio

 9    1.79 (.98)     1.83 (.89)      1.96 (.?9)    2.03 (.96)      2.35 (1.23)   4.78***
10    2.?2 (1.18)    2.20 (1.06)     2.26 (1.00)   2.48 (1.10)     2.92 (1.35)   5.46***
11    3.63 (1.26)    3.79 (1.03)     3.77 (.92)    3.67 (.99)      3.81 (.98)    1.53 (NS)
12    1.67 (.87)     1.83 (.87)      1.91 (.80)    1.97 (.92)      2.42 (1.06)   6.9?
13    1.93 (1.07)    1.92 (.98)      1.98 (.94)    1.94 (.90)      2.19 (.98)    0.77 (NS)
14    1.31 (.?1)     1.41 (.62)      1.52 (.69)    1.55 (.78)      1.73 (.92)    7.40
15    1.71 (.95)     1.83 (.90)      1.89 (.89)    1.90 (.84)      2.12 (1.07)   2.27 (NS)
16    3.70 (1.07)    3.68 (.94)      3.58 (.80)    3.53 (.89)      3.42 (1.10)   3.?0**
17    1.82 (1.00)    1.96 (.86)      2.06 (.83)    2.14 (.??)      2.62 (1.17)   7.11
18    2.13 (1.15)    2.27 (1.05)     2.44 (1.05)   2.56 (1.14)     2.73 (1.31)   6.22
19    3.42 (1.23)    3.40 (1.12)     3.11 (1.12)   2.89 (1.?7)     2.54 (1.07)   14.82
20    2.26 (1.10)    2.62 (1.01)     2.95 (.94)    3.29 (.84)      3.56 (1.00)   40.63
21    2.20 (1.00)    2.39 (.96)      2.54 (.96)    2.77 (.96)      2.81 (1.13)   11.34
22    2.43 (1.0?)    2.61 (.89)      2.70 (.90)    2.85 (.93)      3.17 (1.00)   6.71
23    2.40 (1.15)    ?.?? (1.03)     2.49 (.98)    2.46 (.90)      2.84 (1.21)   1.38 (NS)
24    3.86 (1.02)    3.75 (.97)      3.58 (.95)    3.39 (1.07)     3.12 (1.21)   9.08
25    3.86 (.94)     3.57 (.75)      3.42 (.69)    3.29 (.76)      3.31 (.84)    13.05
26    3.99 (1.04)    3.74 (.88)      3.31 [?]      2.94 (1.16)     2.65 (1.23)   55.02
27    2.56 (1.31)    2.7? (1.22)     2.91 (1.19)   2.94 (1.22)     3.35 [?]      14.98
28    2.12 (1.01)    2.35 (.93)      ?.52 (.96)    2.84 (1.05)     ?.00 (1.12)   [?]

The number of missing cases ranged from 0.7% to 4.8%. Based on 2214 responses (0.7%
missing), the breakdown of responses by effort was: Excellent (10.3%), Above Avg.
(39.9%), Average (41.6%), Below Avg. (7.0%), Poor (1.2%).

All Fs are significant at p<.0001 with exception of asterisks (***p<.001, **p<.01).




course in a more favourable light. This, however, is pure speculation
and the findings are insufficient to support such a causal
relationship.

To our knowledge, findings of this sort have not been widely
reported in the literature. The results again question the notion
that individual student ratings are not biased by presumably irrelevant
factors.

3.6 THE EFFECT OF INSTRUCTOR VARIABLES ON SOST RATINGS

Ss. These analyses were based on the responses of 2229 students
who were enrolled in 93 class sections taught by 53 different instructors.
For a description of these subjects see Table 1.3.

Analytic Methods. Analysis of variance procedures were used to
assess the effects of the instructor's rank and sex on SOST ratings.
All analyses were performed on class means (N=93) since mean ratings
are most often used to assess teaching effectiveness.

Mean ratings for individual items within class sections were
calculated using SAS procedure "means." One-way analyses of variance
were performed on class ratings by instructor's rank and by instructor's
sex. The "GLM" procedure of the SAS package was used in these analyses.
A fixed-effects model (I) was employed. Only univariate F-ratios and
their significance levels are reported. Student "t-tests" were not
performed even though they are commonly used in 2-group designs.
See section 3.54. A breakdown of means and standard deviations by
sex and rank was accomplished using SPSS subprogram "Breakdown."
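The two-step procedure described above (collapse student ratings to class section means, then run a one-way analysis of variance on those means) can be sketched as follows. This is an illustration only: the records are hypothetical, the names `records`, `section_means` and `one_way_f` are our own, and the study itself used SAS and SPSS rather than Python.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student-level records: (section id, instructor rank, item rating).
records = [
    ("S1", "Professor", 2), ("S1", "Professor", 1), ("S1", "Professor", 2),
    ("S2", "Professor", 1), ("S2", "Professor", 2),
    ("S3", "Other", 3), ("S3", "Other", 2), ("S3", "Other", 3),
    ("S4", "Other", 2), ("S4", "Other", 4),
    ("S5", "Other", 3), ("S5", "Other", 3),
]

# Step 1: one mean rating per class section (the unit of analysis).
ratings_by_section = defaultdict(list)
rank_of_section = {}
for section, rank, rating in records:
    ratings_by_section[section].append(rating)
    rank_of_section[section] = rank
section_means = {s: mean(r) for s, r in ratings_by_section.items()}

# Step 2: group the section means by instructor rank.
means_by_rank = defaultdict(list)
for section, section_mean in section_means.items():
    means_by_rank[rank_of_section[section]].append(section_mean)


def one_way_f(groups):
    """Fixed-effects one-way ANOVA F ratio for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


f_ratio = one_way_f(list(means_by_rank.values()))
print(len(section_means), round(f_ratio, 2))
```

Analysing section means rather than individual responses makes N the number of class sections, which is why the tables in this section report N=93 rather than N=2229.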

A description of instructors by sex and rank is presented in
Table 1.2.

3.61 The Effect of Instructor's Rank on SOST Ratings

For purposes of this analysis, 4 academic ranks were identified:
Professor, Associate Professor, Assistant Professor, and Other. The
category of "Other" includes all non-professorial teaching staff
including Lecturers, Instructors, and Teaching Assistants.

Five (5) of the SOST items showed significant differences when
class means were categorized by the academic rank of the instructor.
In all cases the evaluations tended to favour senior staff members






(Professor, Associate Professor) over junior staff members (Table 3.12).

Senior staff members were judged to be more consistently prepared
for class (Item 12), more readily available for consultation
(Item 16), generally more helpful in their attitude toward students
(Item 17), better able to motivate students (Item 20), and less
demanding in the amount of work required (Item 25). These findings
may be surprising to some; however, they are consistent with previous
work in the area (Table 2.9).

It had been our belief and that of many others that students
perceive senior faculty as far too busy with research and professional
matters to be available and helpful to undergraduate students.
Interestingly, this turns out not to be the case.

These findings, however, do not address a more important
question: "Do favourable ratings imply that senior faculty members
are in fact more effective teachers than junior faculty members?"
Or, put another way, "Are students biased in their ratings with
respect to their instructors' age, experience, demeanor and general
appearance or do they in fact learn more effectively when taught
by senior faculty members?" Tantalizing as this question is, it
is simply unanswerable on the basis of the available evidence.

3.62 The Effect of Instructor's Sex on SOST Ratings

Significant differences were found on 7 SOST items when class
section means were classified by instructor's sex. Where differences
were found, mean ratings consistently favoured male instructors over
female instructors (Table 3.13). These findings are not consistent
with a large body of evidence which tends to show that student ratings
are not affected by instructor's sex.

Logically, one might entertain 3 possible explanations for
these findings:

1) the male instructors in the sample were in fact more
effective teachers than the female instructors

2) the students were biased in their evaluations

3) the effects of instructor sex are confounded by other
variables.






Table 3.12 EFFECT OF INSTRUCTOR'S RANK ON SOST RATINGS

(N=93)

Means and (Standard Deviations) by Rank

                    Associate     Assistant
Item  Professor     Professor     Professor     Other         F ratio

 9    1.9? (.76)    1.78 (.57)    1.74 (.39)    1.84 (.32)    0.81 (NS)
10    2.04 (.66)    2.21 (.61)    2.11 (.71)    2.?3 (.45)    0.4? (NS)
11    3.95 (.54)    3.78 (.72)    3.66 (.64)    3.81 (.40)    0.70 (NS)
12    1.64 (.53)    1.77 (.34)    1.86 (.73)    2.06 (.45)    2.70*
13    1.73 (.60)    2.06 (.51)    1.98 (.45)    1.98 (.44)    1.06 (NS)
14    1.24 (.19)    1.54 (.31)    1.47 (.33)    1.54 (.??)    1.95 (NS)
15    1.64 (.36)    1.69 (.51)    1.89 (.45)    1.71 (.32)    1.42 (NS)
16    3.98 (.48)    3.97 (.51)    3.62 (.48)    3.57 (.24)    6.42***
17    1.64 (.38)    1.77 (.47)    2.10 (.40)    1.84 (.27)    4.87**
18    2.05 (.60)    2.08 (.59)    2.26 (.56)    2.37 (.43)    1.95 (NS)
19    3.55 (.54)    3.46 (.48)    3.31 (.52)    3.24 (.44)    1.70 (NS)
20    2.34 (.51)    2.50 (.56)    2.60 (.47)    2.81 (.38)    ?.31**
21    2.23 (.56)    2.35 (.52)    2.37 (.54)    2.33 (.36)    0.2? (NS)
22    2.13 (.50)    2.49 (.44)    2.44 (.49)    2.46 (.40)    1.70 (NS)
23    2.15 (.58)    2.29 (.35)    2.41 (.58)    2.55 (.44)    2.59 (NS)
24    3.55 (.52)    3.60 (.38)    3.45 (.45)    3.70 (.34)    ?.03 (NS)
25    3.51 (.24)    3.53 (.25)    3.49 (.50)    3.42 (.30)    0.55 (NS)
26    2.81 (.48)    3.35 (.50)    3.61 (.51)    3.71 (.36)    13.32****
27    2.99 (.55)    2.90 (.24)    2.88 (.54)    2.74 (.43)    1.37 (NS)
28    2.13 (.54)    2.20 (.35)    2.29 (.36)    2.33 (.28)    1.20 (NS)

Of the 93 class sections, 10 were taught by Professors, 14 were taught by
Associate Professors, 20 were taught by Assistant Professors, and 49 were
taught by "other" staff members, primarily T.A.s.

****p<.0001, ***p<.001, **p<.01, *p<.05.






Table 3.13 EFFECT OF INSTRUCTOR'S SEX ON SOST RATINGS

(N=93)

Means and (Standard Deviations) by Sex

Item  Female        F ratio

 9    1.92 (.52)    ?.35 (NS)
10    2.30 (.54)    3.26 (NS)
11    3.69 (.55)    2.16 (NS)
12    2.01 (.3?)    ?.46 (NS)
13    2.17 (.54)    13.28***
14    1.55 (.40)    1.56 (NS)
15    1.81 (.38)    2.27 (NS)
16    3.59 (.32)    3.84 (NS)
17    1.89 (.30)    0.22 (NS)
18    2.42 (.48)    6.47*
19    3.18 (.47)    5.74*
20    2.81 (.44)    6.22*
21    2.39 (.46)    1.17 (NS)
22    2.52 (.36)    3.21 (NS)
23    2.67 (.45)    16.17****
24    3.57 (.40)    0.98 (NS)
25    3.48 (.32)    0.11 (NS)
26    3.67 (.47)    ?.99*
27    2.8? (.43)    0.04 (NS)
28    2.38 (.35)    5.78*

Of the 93 class sections 55 were taught by male instructors and 38 were
taught by female instructors.

****p<.0001, ***p<.001, *p<.05.




The third explanation is perhaps the most likely.

An examination of Table 1.2 shows that the large majority of
the female instructors in the sample are found in the lower academic
ranks (as they are at the University in general). As a result,
the effect of instructor sex may be confounded by the effect of
academic rank. In order to test this hypothesis one could perform
a 2-way analysis of variance, thereby partialling out the variance
attributable to each of the main effects (sex and rank). Unfortunately,
the sample size is too small for an adequate analysis. Such
an analysis would be based on an experimental design containing
a number of near-empty cells.
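The near-empty-cell problem can be made concrete by cross-tabulating sections by sex and rank before attempting a 2-way analysis. The sketch below uses hypothetical counts (the actual breakdown is in Table 1.2, which is not reproduced here), and the threshold `MIN_PER_CELL` is an assumption chosen only for illustration.

```python
from collections import Counter
from itertools import product

SEXES = ("male", "female")
RANKS = ("Professor", "Associate Professor", "Assistant Professor", "Other")

# Hypothetical (sex, rank) labels, one per class section; not the study's data.
sections = (
    [("male", "Professor")] * 4
    + [("male", "Associate Professor")] * 5
    + [("male", "Assistant Professor")] * 6
    + [("male", "Other")] * 7
    + [("female", "Associate Professor")] * 1
    + [("female", "Assistant Professor")] * 3
    + [("female", "Other")] * 9
)

cell_counts = Counter(sections)

MIN_PER_CELL = 2  # assumed cut-off for calling a cell "near-empty"
sparse_cells = [
    cell for cell in product(SEXES, RANKS) if cell_counts[cell] < MIN_PER_CELL
]
print(sparse_cells)
```

When one or more sex-by-rank cells hold almost no sections, the within-cell variance for those cells cannot be estimated reliably, which is the difficulty the paragraph above describes.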

3.7 THE EFFECT OF CLASS VARIABLES ON SOST RATINGS

Ss. These analyses were based on the responses of 2229 students
who were enrolled in 93 class sections taught by 53 different instructors.
For a description of these subjects see Table 1.3.

Analytic Methods. Analysis of variance procedures were used to
assess the effects of class size and meeting time on SOST ratings.
All analyses were performed on class means (N=93). For a further
description of analytic methods see section 3.6.

3.71 The Effect of Class Size on SOST Ratings

Each of the 93 class sections was categorized as either "small",
"medium" or "large". Operationally, a small class is defined as
having fewer than 20 students; a medium class as having 20 to 50
students; and a large class as having more than 50 students. The
mean class size for all sections was approximately 24 (23.97).
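The operational definition above translates directly into a small classification rule. This is a sketch only: the function name `size_category` and the sample enrolments are ours, while the cut-offs (fewer than 20, 20 to 50, more than 50) and the totals of 2229 students and 93 sections come from the text.

```python
def size_category(enrolment: int) -> str:
    """Classify a class section using the report's cut-offs."""
    if enrolment < 20:
        return "small"
    if enrolment <= 50:
        return "medium"
    return "large"


# Hypothetical enrolments for a few sections (illustrative only).
for n in (12, 20, 50, 51, 140):
    print(n, size_category(n))

# The reported mean class size follows from the totals given in the text.
mean_class_size = 2229 / 93
print(round(mean_class_size, 2))
```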
The results of the analyses of variance showed significant
differences for 10 SOST items when section mean responses were
classified by class size (Table 3.14). Mean section ratings generally
favoured small and/or medium sized classes over large classes. These
findings support previous work on the effects of class size on
evaluations (Table 2.10).

A long-standing debate among educators and psychologists has
centred round the effect of class size on school learning. Do students






Table 3.14 EFFECT OF CLASS SIZE ON SOST RATINGS

(N=93)

Means and (Standard Deviations) by Class Size

Item  Small (<20)   Medium (20-50)  Large (>50)   F ratio

 9    1.72 (.31)    1.82 (.30)      2.01 (.?5)    3.34*
10    2.12 (.53)    2.19 (.55)      2.27 (.62)    0.51 (NS)
11    3.85 (.57)    3.81 (.48)      3.65 (.50)    1.14 (NS)
12    1.96 (.59)    1.98 (.55)      1.79 (.3?)    0.98 (NS)
13    2.02 (.46)    1.78 (.37)      2.13 (.54)    4.88**
14    1.55 (.40)    1.48 (.44)      1.42 (.17)    0.97 (NS)
15    1.67 (.3?)    1.67 (.31)      1.9? (.45)    5.34**
16    3.72 (.50)    3.68 (.29)      3.56 (.37)    1.76 (NS)
17    1.74 (.27)    1.82 (.32)      2.13 (.43)    10.45****
18    2.20 (.49)    2.27 (.44)      2.38 (.62)    0.84 (NS)
19    3.34 (.51)    3.41 (.41)      3.17 (.49)    1.71 (NS)
20    2.58 (.??)    [?] (.39)       2.?3 (.47)    1.30 (NS)
21    2.19 (.42)    2.38 (.39)      2.?? (.48)    4.49*
22    2.2? (.47)    2.45 (.42)      2.66 (.34)    6.13**
23    2.4? (.64)    2.31 (.27)      2.58 (.38)    2.08 (NS)
24    3.64 (.44)    3.75 (.31)      3.3? (.34)    6.04**
25    3.44 (.38)    3.46 (.29)      3.51 (.31)    0.37 (NS)
26    3.41 (.46)    3.72 (.42)      3.50 (.62)    3.44*
27    3.01 (.50)    2.60 (.34)      2.78 (.32)    8.43***
28    2.14 (.34)    2.29 (.28)      2.50 (.51)    9.38***

Of the 93 sections 40 were classified as "small" (fewer than 20
students), 32 were classified as "medium" (20-50 students) and 21 were
classified as "large" (more than 50 students). Average class size was
23.97.




actually "learn more" in a small class? While evidence exists both
for and against this proposition, it is generally agreed that teachers
and students alike prefer smaller classes to larger ones. The results
of these analyses should, therefore, not be surprising.

3.72 The Effect of Class Meeting Time on SOST Ratings

Significant differences were found for 3 SOST items when section
mean responses were classified by class meeting time (Table 3.15).
These findings do not support the often-heard contention that morning
classes are rated more favourably than mid-day and afternoon classes.
The results indicate that students enrolled in afternoon and
evening classes feel that verbal and written feedback have been more
constructive than students in other classes (Item 22). Furthermore,
afternoon and evening students feel that the work required by the
course was less intensive (Item 26) and they are less likely to indicate
that the material was beyond their previous academic experience
(Item 27).

IV SUMMARY, CONCLUSIONS AND RECOMMENDATIONS

This section summarizes the findings of the study, presents the
conclusions, and forwards several recommendations concerning the
development and use of teaching evaluation instruments at the University of
Windsor.

4.1 SUMMARY

4.11 Internal Consistency

The internal consistency of the SOST (using Cronbach's alpha
coefficient) was found to be moderate to relatively high on three of
the subscales (Sections A, B, and C). However, the alpha coefficients
for Section D (.37) and Section E (.19) were unacceptably low. This
finding is consistent with the factor analysis which shows that items
in Sections D and E load on separate factors.
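For readers unfamiliar with the coefficient, a minimal sketch of Cronbach's alpha follows, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrices are hypothetical and the function name `cronbach_alpha` is our own; nothing here reproduces the SOST data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    total_variance = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical subscale whose three items move together perfectly.
coherent = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]

# Hypothetical subscale whose three items are only weakly related.
mixed = [[1, 2, 3], [2, 1, 4], [3, 4, 1], [4, 3, 5], [5, 5, 2]]

alpha_coherent = cronbach_alpha(coherent)
alpha_mixed = cronbach_alpha(mixed)
print(round(alpha_coherent, 3), round(alpha_mixed, 3))
```

A subscale whose items measure different things, as the factor analysis suggests for Sections D and E, behaves like the second example: the item variances nearly exhaust the variance of the total score, so alpha falls toward zero.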

4.12 Stability

The stability coefficients for the SOST were found to be moderate
to low, but generally within the range reported for other teaching






Table 3.15 EFFECT OF CLASS MEETING TIME ON SOST RATINGS

(N=93)

Means and (Standard Deviations) by Meeting Time

Item  Morning       Mid-day       Afternoon/Evening   F ratio

 9    1.83 (.41)    [?]           1.78 (.42)          0.13 (NS)
10    2.23 (.61)    [?]           2.15 (.52)          0.17 (NS)
11    3.78 (.52)    [?]           3.76 (.61)          0.05 (NS)
12    1.9? (.48)    [?]           1.95 (.56)          0.05 (NS)
13    1.90 (.37)    [?]           2.06 (.47)          0.76 (NS)
14    1.51 (.30)    1.47 (.??)    1.52 (.40)          0.16 (NS)
15    1.83 (.48)    1.6? (.??)    1.80 (.32)          2.11 (NS)
16    3.64 (.46)    [?]           3.67 (.41)          0.39 (NS)
17    1.98 (.50)    [?]           1.88 (.?1)          2.7? (NS)
18    2.33 (.?1)    [?]           2.23 (.48)          0.2? (NS)
19    3.19 (.58)    ?.44 (.??)    3.24 (.48)          2.73 (NS)
20    2.79 (.53)    2.61 (.42)    2.63 (.47)          1.23 (NS)
21    2.44 (.51)    [?]           ?.?3 (.42)          1.46 (NS)
22    2.61 (.45)    2.40 (.43)    2.2? (.39)          4.61*
23    2.42 (.36)    2.44 (.57)    2.47 (.49)          0.06 (NS)
24    3.58 (.35)    3.62 (.45)    3.64 (.37)          0.?6 (NS)
25    3.42 (.31)    ?.4? (.37)    3.52 (.29)          0.5? (NS)
26    3.71 (.36)    3.53 (.52)    3.34 (.58)          3.42*
27    2.63 (.30)    2.?? (.52)    2.94 (.38)          3.75*
28    2.35 (.32)    2.?4 (.35)    2.27 (.35)          0.84 (NS)

Twenty-seven (27) classes met in the morning (9:00 or 10:00 A.M.), 43
classes met at mid-day (11:00, 12:00 or 1:00) and 23 classes met in the
afternoon or evening (2:00 - 7:00 P.M.).

*p<.05.






evaluation instruments. Mean coefficients were: .58 (7 day interval),
.53 (14 day interval), .?9 (21 day interval) and .43 (28 day interval).
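A stability (test-retest) coefficient of this kind is commonly computed as the correlation between ratings from two administrations of the same instrument. The sketch below assumes a Pearson product-moment correlation, which this summary does not explicitly name, and uses hypothetical ratings; `pearson_r` is our own helper.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings of the same item by the same five students,
# collected at a first and a second administration.
first_administration = [1, 2, 3, 4, 5]
second_administration = [2, 1, 4, 3, 5]

stability = pearson_r(first_administration, second_administration)
print(round(stability, 2))
```

Coefficients that decline as the interval lengthens, as reported above, are what one would expect if students' opinions drift over the weeks between administrations.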

4.13 Relationship Between Ratings and Student Achievement

Low but significant correlations were found between 11 of the
SOST items and student achievement in an introductory psychology course.
In all cases, the significant correlations indicated a positive
relationship between student ratings and achievement.

These findings are taken as evidence that the instrument possesses
a certain degree of criterion-related validity for it argues that
instructors who receive favourable ratings are more successful in
facilitating learning among their students than instructors who
receive less favourable ratings.

4.14 Factor Analysis

Five factors emerged from the factor analysis procedure. These
factors accounted for approximately 55% of the variance in the item
responses. In general, factor loadings were moderate to low and the
interpretation of the factor structure was complicated by a significant
number of cross-loadings.

The five factors were identified as follows:

Factor I - Instructional Skill (Items [?], 12, 18, 19, 20, and [?]).
    This is a general factor which measures the instructor's
    ability to communicate with and motivate students. The
    large number of items indicates a possible "halo effect."

Factor II - Interaction (Items 17, 20, 21, 22, and 28). This
    factor is a measure of rapport and the general level
    of verbal and written exchanges between students and
    the instructor.

Factor III - Workload (Items 25 and 26). This factor is a measure
    of the amount of work required in the course.

Factor IV - Organization (Items 12, 13, and 14). This factor is
    a general assessment of the instructor's preparedness
    and clarity in explaining course objectives and
    requirements.

Factor V - Feedback (Item 24). This factor measures the extent
    to which the student is able to assess his progress
    and achievement in the course.

4.15 Effect of Student Variables on Ratings

The results of a series of analyses of variance indicated that:

a) the student's major (faculty affiliation) may affect his or her
evaluations of courses and instructors

b) upper-level and graduate-level students tend to rate instructors
more favourably than lower-level students

c) students who feel that their performance is "superior" or
"above average" relative to others in the class tend to give
their instructors better ratings

d) elective courses are rated more favourably than required
courses

e) students who report that their effort in the course was
"excellent" or "above average" relative to their effort in
other courses rate the instructor and the course more favourably
than other students.

4.16 Effect of Instructor Variables on Ratings

Analyses of variance indicated that, in several cases (5 items),
senior faculty members (Professor, Associate Professor) are evaluated
more favourably than junior faculty members (Assistant Professor, others).
Furthermore, male instructors receive more positive ratings than female
instructors on 7 items.

4.17 Effect of Class Variables on Ratings

A final series of analyses of variance showed that small and medium
sized classes tend to receive more favourable ratings than large classes
(10 items) but that class meeting time generally has no effect on student
evaluations.




4.2 CONCLUSIONS AND RECOMMENDATIONS

Although the SOST seems to possess some relatively positive
psychometric qualities, namely criterion-related validity and reasonable
[illegible], the instrument should not be adopted in its present form
without revision and further testing. Of particular concern is the
factor structure and the associated internal consistency as well as
student, instructor, and class variables which, in some cases,
consistently affect student ratings.

Specifically, the existing subscale organization of the SOST does
not accurately reflect the factor structure of the instrument. This
is shown by the magnitude of loadings within factors and is manifest
in low internal consistency coefficients among 2 of the 5 subscales.

Furthermore, serious consideration must be given to factors
which affect student ratings, especially those over which instructors
have little or no control (ex: class size, required/elective course,
instructor sex and rank). If major decisions concerning faculty fate
are to be based, in part, on student evaluations, then ratings must
be adjusted to take these factors into account.

One of the avowed purposes of student evaluations is to provide
feedback to instructors who wish to improve their teaching effectiveness.
Perhaps this is the most important use of these ratings. Unfortunately,
there is considerable doubt whether items stated in global terms (such
as "The instructor motivated one to put forth a good effort") provide
this feedback in sufficiently specific terms. Compare, for example,
items on the SOST with the following items taken from Murray's (1977)
Teacher Rating Form:

The instructor:

14. moves back and forth in front of class
17. asks students questions during lecture
20. addresses individual students by name
23. maintains eye contact with students
31. gestures with hands and arms while speaking

Items such as these which are based on specific, observable






teaching behaviours are considerably more useful to instructors and
are probably more reliable since students need not make inferences
concerning the instructor's motivations or general abilities.

Based on these and other considerations the following
recommendations are forwarded:

Recommendation 1: If the existing instrument is to be retained,
the following revisions should be considered:

a) The existing subscale organization of the SOST should be
dropped in favour of either random ordering of items
without subscale headings or use of subscales which
reflect the factor structure of the instrument (Instructional
Skill, Interaction, Workload, Organization, and
Feedback).

b) Items 15, 15, 23, and 27 should be omitted. These items
have reasonably low factor loadings and therefore tend to
obscure the significance of individual factors.

c) Items 12, 20, and 21 cross-load significantly on two factors.
These items should be reworded or deleted.

d) The Likert scales associated with Items 25 and 26 should be
reworded so that they are consistent with other items
(ex: strongly agree, agree).

e) Other instruments should be examined to find additional and
appropriate replacement items within factors (Appendix D).

f) An additional subscale containing items on grading procedures
should be added.

Recommendation 2: Several other existing instruments should be
reviewed as possible alternatives to the SOST (Appendix D). The following
instruments were administered to students who were enrolled in the second
semester of an introductory psychology course (Psychology 115b):

a) Murray's Teacher Rating Form (1977)

b) Educational Testing Service's Student Instructional Report
(1975)

c) Kansas State University's IDEA Survey Form (1975)

d) Purdue University's "Cafeteria" Instructional Rating Form (1975)






Correlations between items on each of these instruments and
SOST items are given in Appendix D. In addition to the instruments
above, Frey's Instructional Rating Form (Frey, 1973 and
Appendix D) should be given serious consideration.

The advantages of these instruments over the SOST are several.
As was mentioned earlier, Murray's instrument is based on specific,
observable behaviours and therefore may provide more informative
feedback to instructors. It may also be quite reliable.

The advantage of the ETS, Kansas State and Purdue instruments
is that a considerable amount of normative data is already available
based on classes in a wide range of academic disciplines, class sizes,
instructor ranks and so forth. Use of these instruments and their
normative scales would reduce the problem of adjusting for differences
in student, instructor and class variables.

Finally, items of Frey's Instructional Rating Form have been
shown to correlate very well with student achievement (r ~ .90). This
implies validities considerably better than the SOST. In addition,
factor loadings on each of Frey's 7 subscales are consistently high
(approximately .90).

Recommendation 3: Further studies should re-examine the effects
of student, instructor, and class variables on ratings. Before any
rating system is institutionalized, a method of adjusting ratings for
these variables must be developed.

Recommendation 4: Further studies should examine differences
among ratings in the various departments, schools, and faculties
of the University to determine how best to adjust for differences in
academic disciplines and instructional styles.




BIBLIOGRAPHY 



Aleamoni, L.M., and Graham, M.H. "The Relationship Between CEQ Rating
and Instructor's Rank, Class Size and Course Level." Journal
of Educational Measurement 11(3) (Fall 1975): 189-202.

Aleamoni, L.M., and Spencer, R.E. 1973. "The Illinois Course
Evaluation Questionnaire: A description of its development and a
report of some of its results." Educational and Psychological
Measurement 33(3): 669-684.

Aleamoni, L.M., and Yimer, M. 1973. "An investigation of the
relationship between colleague rating, student rating, research
productivity, and academic rank in rating instructional
effectiveness." Journal of Educational Psychology 64: 274-277.

American Psychological Association. Standards for Educational and
Psychological Tests. Washington, D.C.: APA, 1974.

Anastasi, Anne. Psychological Testing. 4th ed. New York: Macmillan
Publishing Co., 1976.

Bausell, R.B., Schwartz, Stanley, and Purohit, Anal. "An Examination
of the Conditions Under Which Various Student Rating Parameters
Replicate Across Time." Journal of Educational Measurement
12 (Winter 1975): 273-280.

Bendig, A.W. "A preliminary study of the effect of academic level,
sex, and course variables on student rating of psychology
instructors." Journal of Psychology 34 (1952): 21-26.

Bendig, A.W. "A factor analysis of student ratings of psychology
instructors on the Purdue Scale." Journal of Educational
Psychology 45 (1954): 385-393.

Bousefield, W.A. "Students' ratings of qualities considered desirable
in college professors." School and Society 1940, 51: 253-256.

Bresler, J.B. "Teaching effectiveness and government awards." Science
1968, 169: 164-167.

Brown, William. "Some experimental results in the correlation of mental
abilities." British Journal of Psychology 3, 296-322.






Butts, David P., and Wm. R. Capie. "Evaluating teachers using teacher
performance." Paper presented at the 50th Annual Meeting of the
National Association for Research in Science Teaching, Cincinnati,
1977.

Centra, J.A. "Self-ratings of college teachers: a comparison with student
ratings." Journal of Educational Measurement 10(4) (Winter 1973):
287-295.

Centra, J. 1974. "College teaching: Who should evaluate it?" Princeton
(N.J.): Educational Testing Service.

Centra, John A. "Colleagues as Raters of Classroom Instruction." Journal
of Higher Education 1975, 46(1): 327-337.

Clinton, R.J. "Qualities college students desire in college instructors."
School and Society 1930, 32: 702.

Cohen, J., & Humphreys, L.G. Memorandum to faculty, University of
Illinois, Department of Psychology, 1960 (mimeographed).

Costin, F. "Intercorrelations between students' and course chairmen's
ratings of instructors." University of Illinois, Division of
General Studies, 1966.

Costin, F. "A Graduate Course in the Teaching of Psychology: Description
and Evaluation." Journal of Teacher Education 19 (1968): 425-432.

Costin, F., Greenough, W.T., & Menges, R.J. "Student Ratings of College
Teaching: Reliability, Validity, and Usefulness." Review of
Educational Research 41 (1971): 511-535.

Creager, J.A. "A multiple-factor analysis of the Purdue Rating Scale for
Instructors." Purdue University Studies in Higher Education, 1950,
70: 75-96.

Crittenden, Kathleen S., Norr, James L., & LeBailly, Robert K. "Size
of University Classes and Student Evaluation of Teaching."
Journal of Higher Education 1975, 46(4): 461-470.

Cronbach, Lee J. Essentials of Psychological Testing. New York, N.Y.:
Harper & Row, 1970.






Deshpande, A.S., Webb, S.C., & Marks, B. "Student perceptions of
engineering instructor behaviors and their relationships to the
evaluation of instructors and courses." American Educational
Research Journal 1970, 7: 289-305.

Downie, N.H. "Student evaluation of faculty." Journal of Higher
Education 23 (1952): 495-496.

Doyle, K.O., Jr. 1972. "Construction and evaluation of scales for rating
instructors." Dissertation Abstracts International, 1972, 33 (5-A):
2163.

Doyle, Kenneth O., Jr. Student Evaluation of Instruction. Lexington,
Massachusetts: D.C. Heath and Company, 1975.

Drucker, A.J., and Remmers, H.H. "Do alumni and students differ in their
attitudes toward instructors?" Purdue University Studies in
Higher Education 1950, 70: 62-64.

Educational Testing Service. Student Instructional Report Comparative
Data Guide 1975-1976. ETS College and University Programs, Box
2813, Princeton, New Jersey, 1975.

Elliott, D.H. "Characteristics and relationships of various criteria
of college and university teaching." Purdue University Studies in
Higher Education 1950, 70: 5-61.

Elsmore, Patricia B., and LaPointe, Karen A. "Effects of Teacher Sex
and Student Sex on the Evaluation of College Instructors."
Journal of Educational Psychology 66 (1974): 386-389.

French, G.M. "College students' concept of effective teaching determined
by an analysis of teacher ratings." Dissertation Abstracts, 1957,
17: 1380-1381.

Frey, P.W. "Comparative judgement scaling of student course ratings."
American Educational Research Journal 1973, 10: 149-154.

Frey, P.W. 1973. "Student ratings of teaching: Validity of several rating
factors." Science 182: 83-85.

Gadzella, B.M. "College student views and ratings of an ideal professor."
College and University 1968, 44: 89-96.






Gage, N.L. "The appraisal of college teaching: An analysis of ends
and means." Journal of Higher Education 1961, 32: 17-22.

Gessner, P.K. "Evaluation of instruction." Science 1973, 180: 566-570.

Gibb, C.E. "Classroom behavior of the college teacher." Educational
and Psychological Measurement 1955, 15: 254-263.

Gulliksen, Harold. Theory of Mental Tests. New York: John Wiley & Sons,
Inc., 1965.

Hartley, E.L., & Hogan, T.P. "Some additional factors in student
evaluation of courses." American Educational Research Journal
1972, 9: 241-250.

Harvey, J.N., and Barker, D.G. "Student Evaluation of Teaching
Effectiveness." Improving College and University Teaching 1970,
18: 275-278.

Hayes, J.R. "Research, teaching and faculty fate." Science 1971,
172: 227-230.

Hildebrand, M., Wilson, R.C., and Dienst, E.R. 1971. Evaluating university
teaching. Berkeley: Center for Research and Development in Higher
Education, University of California, Berkeley.

IDEA. The Instructional Development and Effectiveness Assessment System.
Center for Faculty Evaluation and Development in Higher Education,
1627 Anderson Avenue, Box 3000, Manhattan, Kansas, 1975.

Isaacson, R.L., McKeachie, W.J., Milholland, J.E., Lin, Y.G., Hofeller,
M., Baerwaldt, J.W., & Zinn, K.L. "Dimensions of student
evaluations of teaching." Journal of Educational Psychology 1964,
55: 344-351.

Kennedy, W.R. "Grades Expected and Grades Received: Their relationship
to students' evaluation of faculty performance." Journal of
Educational Psychology 67(1) (Feb. 1975): 109-115.

Kohlan, R.G. "A comparison of faculty evaluations early and late in
the course." Journal of Higher Education 1973, 44: 22-25.

Kooker, E.W. 1968. "The relationship of college grades to course
ratings on student selected items." The Journal of Psychology
69: 209-215.

Kulik, James A., and McKeachie, Wilbert J. "The Evaluation of Teachers
in Higher Education." In Review of Research in Education.
Itasca, Illinois: F.E. Peacock Publishers, Inc., 1975.

Langen, T.D.F. "Student assessment of teaching effectiveness." Improving
College and University Teaching 1966, 14: 22-25.

Lovell, G.D., & Haner, C.F. "Forced-choice applied to college faculty
rating." Educational and Psychological Measurement 1955, 15:
291-304.



Maslow, A.H., and Zimmerman, W. 1956. "College teaching ability,
scholarly activity and personality." Journal of Educational
Psychology 47: 185-189.

McDaniel, E.D., and Feldhusen, J.F. "Relationships between faculty
ratings and indexes of service and scholarship." Proceedings,
78th Annual Convention APA, 1970: 619-620.

McDaniel, E.D., & Feldhusen, J.F. "College teaching effectiveness."
Today's Education 1971, 60: 27.

McKeachie, W.J., Isaacson, R.L., and Milholland, J. 1964. "Research on
the characteristics of effective college teaching." Final report,
Cooperative Research Project No. SAE 850, Office of Education,
Department of Health, Education and Welfare. Ann Arbor.

McKeachie, W.J., & Lin, Y.G. "Multiple discriminant analysis of
student ratings of college teachers." Unpublished manuscript,
University of Michigan, 1973.



McKeachie, W.J., Lin, Y., and Mann, W. "Student ratings of teacher
effectiveness: Validity studies." American Educational Research
Journal, 1971, 8: 435-445.

Miller, R.I. Developing Programs for Faculty Evaluation. San Francisco,
California: Jossey-Bass, Inc., 1974.

Miller, R.I. Evaluating Faculty Performance. San Francisco: Jossey-Bass,
1972.

Marsh, H.W., Fleiner, H., and Thomas, C.S. "Validity and Usefulness of
Student Evaluations of Instructional Quality." Journal of
Educational Psychology 67(6) (1975): 833-839.

Morsh, J.E., & Wilder, E.W. "Identifying the effective instructor:
A review of the quantitative studies, 1900-1952." (Research Bull.
AFPTRC-TR-54-44). Air Force Personnel and Training Research
Center, 1954.

Morsh, J.E., Burgess, G.G., and Smith, P.N. "Student achievement as
a measure of instructor effectiveness." Journal of Educational
Psychology 1956, 47: 79-88.

Murray, Harry G. A Guide to Teaching Evaluation. Toronto: Ontario
Confederation of University Faculty Associations, 1973.

Murray, Harry G. "Lecturing-Classroom Behaviours of Social Science
Lecturers Receiving Low, Medium and High Teacher Ratings."
OUPID Newsletter 14 (February 1977): 3-5.

Perry, R.R. "Evaluation of teaching behaviour seeks to measure
effectiveness." College and University Business 1969, 47: 18-22.

Pogue, F.R., Jr. "Students' Ratings of the 'Ideal Teacher'."
Improving College and University Teaching 15 (1967): 133-136.

Pohlmann, John T. "A Description of Teaching Effectiveness as
Measured by Student Ratings." Journal of Educational
Measurement 12 (Spring 1975): 49-54.

Purdue University "Cafeteria" Instructional Rating Form. Purdue
Research Foundation, West Lafayette, Indiana, 1975.

Quereshi, M.Y. "Teaching Effectiveness and Research Productivity."
Science 1968, 161: 1160.






Rayder, N.F. "College student ratings of instructors." Journal of
Experimental Education 1968, 17: 76-81.

Remmers, H.H. "Appraisal of college teaching through ratings and student
opinion." In 27th Yearbook of the National Society of College
Teachers of Education. Chicago: University of Chicago Press,
1939.

Remmers, H.H. Manual of Instructions for the Purdue Rating Scale for
Instructors. (Rev. ed.) West Lafayette, Ind.: University Book
Store, 1960.

Remmers, H.H., and Brandenburg, G.C. 1927. "Experimental data on the
Purdue Rating Scale for Instructors." Educational Administration
and Supervision 13: 519-527.

Remmers, H.H., & Elliott, D.N. "The Indiana College and University Staff-
Evaluation Program." School and Society 1949, 70: 168-171.

Remmers, H.H., Shock, N.W., and Kelly, E.L. "An empirical study of
the Spearman-Brown formula as applied to the Purdue Rating Scale."
Journal of Educational Psychology 1927, 18: 187-195.

Remmers, H.H., & Weisbrodt, J.A. Manual of Instructions for the Purdue
Rating Scale for Instruction. West Lafayette, Indiana: University
Book Store, 1965.

Rodin, M., and Rodin, B. 1972. "Student evaluations of teachers." Science
177: 1164-1166.

Root, A.R. 1931. "Student Ratings of Teachers." Journal of Higher Education
2: 311-315.

Skanes, S.R., and Sullivan, A.M. "Validity of Student Evaluation of
Teaching and the Character of Successful Instructors." Journal
of Educational Psychology 66(4): 584-590, 1974.

Smalzreid, N.T., & Remmers, H.H. "A factor analysis of the Purdue Rating
Scale for Instructors." Journal of Educational Psychology 1943,
34: 363-367.

Solomon, D. "Teacher behaviour dimensions, course characteristics, and
student evaluations of teachers." American Educational Research
Journal 1965, 3: 35-47.



Spearman, Charles. "Correlation calculated with faulty data."
British Journal of Psychology 3, 271-295.

Student Evaluations Committee. "Report of the Senate Student
Evaluations Committee." Faculty Senate, University of Windsor,
December 16, 1975.

Thorndike, R.L. Personnel Selection. New York: Wiley, 1949.

Turner, R.L. "Good teaching and its contexts." Phi Delta Kappan
1970, 51: 155-158.

Voeks, V.W. "Publications and teaching effectiveness." Journal of
Higher Education 1962, 33: 212-218.

Voeks, V.W., & French, G.M. "Are student ratings of teachers affected
by grades?" Journal of Higher Education 1960, 31: 330-334.

Wherry, R.J. 1952. "Control of bias in ratings." Department of the
Army, The Adjutant General's Office, Personnel Research and
Procedures Division, Personnel Research Branch. PRS Reports
914, 915, 919, 920 and 921.

Winer, B.J. Statistical Principles in Experimental Design. New York,
N.Y.: McGraw-Hill Book Company, 1962, 1971.

Wood, K., Linsky, A.S., and Straus, M.A. "Class Size and Student
Evaluation of Faculty." Journal of Higher Education 1974, 45: 524-534.







APPENDIX A

THE STUDENT OPINION SURVEY OF TEACHING (SOST)
AND
INTERCORRELATIONS BASED ON 2229 STUDENT RESPONSES



STUDENT OPINION SURVEY OF TEACHING



PART I   General Information

1. My major is in: Arts, Social Science, Science, Math, Business, Other

2. This course is part of my honours/general program.

3. I have completed the following number of University level full courses (two half
   courses equal one full): 3-7, 8-12, 13-17, 18-

4. Rating myself against the performance of other students in the class, I see myself
   in one of the following groups: superior, above average, average, below average,
   [illegible]

5. This course was compulsory. YES (A), NO (B), NOT SURE (C)

6. My attendance and punctuality have been consistently good. YES (A), NO (B)

7. Compared to other courses I have taken, I consider my effort in this course to have
   been: excellent (A), above average (B), average (C), below average (D), poor (E)

8. I have found the material in this course to be inherently difficult. YES (A), NO (B)


PART II   ALL FOLLOWING QUESTIONS ARE RATED ON A FIVE-POINT SCALE FROM STRONGLY AGREE TO
          STRONGLY DISAGREE EXCEPT WHERE NOTED.

          Strongly Agree (A), Agree (B), Not Sure (C), Disagree (D), Strongly Disagree (E)

Section A. Communication (Instructor - Group Interaction)

9. The instructor is clear and audible.

10. The instructor presented material in a coherent manner, emphasizing major points
    and making relationships clear.

11. Course material was disorganized and hindered understanding.

12. The instructor was consistently prepared for class.

13. The instructor was clear on what was expected regarding course requirements,
    assignments, exams, etc.

14. The instructor's attendance and punctuality have been consistently good.

Section B. Communication (Instructor - Individual Interaction)

15. The instructor encouraged and readily responded to student questions.

16. The instructor has not been readily available for consultation by appointment
    or otherwise.

17. The instructor maintained a generally helpful attitude toward students and their
    problems.

Section C. Motivation and Impact

18. The instructor made this course as interesting as the subject matter would allow.

19. The instructor did not increase my interest in the subject matter of the course.

20. The instructor motivated me to put forth a good effort.

21. The instructor was successful in making difficult material understandable.

Section D. Feedback

22. Verbal or written comments on assignments have been constructive.

23. The evaluation system for this course was fairly applied.

24. Throughout this course, I have not been able to assess my progress and achievement.

25. [item text illegible] ... performance: very low, low, average, high, very high

Section E. Standards

27. The material covered in this course has been beyond my previous academic experience.

28. The assignments provided a valuable learning experience.

Has this questionnaire given you an adequate opportunity to express your
opinion about the instruction in this course?  YES (A), NO (B)


INTERCORRELATIONS AMONG SOST ITEMS 9-28 (N = 2229 STUDENT RESPONSES)

[Correlation matrix: rows and columns labelled ITEM09 through ITEM28. The
individual coefficients cannot be aligned with their item labels in the
scanned source.]






APPENDIX B
FOLLOW-UP LETTER






University of Windsor

February 17, 1977.

Dear Prof.

Thank you for agreeing to cooperate in the validation of the Student Opinion Survey of Teaching. As
you probably know, this instrument was developed by the Faculty Senate Student Evaluations Committee and is
being considered for university-wide adoption. Dr. David Reynolds and I have received an O.U.P.I.D. Grant to
examine the reliability and validity of the SOST and to recommend changes or revisions. A copy of the instrument
is attached.

So that we might collect data without unnecessarily disrupting your class schedule, I'd like to ask you to
complete the table shown below by indicating:

(1) the course name/number
(2) date you wish the evaluation
(3) class meeting time
(4) meeting place (building and room)
(5) whether the evaluator (research assistant) should distribute
    the instrument at the beginning or at the end of the period
(6) approximate enrollment

Depending on class size, the evaluation requires approximately 10-15 minutes of class time.

Course    Evaluation    Meeting    Meeting    Distribute at      Approximate
          Date          Time       Place      Beginning/End      Enrollment
                                              of Period

All data will be treated confidentially. If you wish, a complete printout of your own evaluation
can be provided. Please indicate if you wish a copy of your evaluation (yes; no).

Please return this information to me by Friday, February 25th.

Thank you again for your cooperation.

Joel J. Mintzes
Assistant Professor
Department of Biology






APPENDIX C

"NORMATIVE" DATA
BASED ON 93 CLASSES






"NORMATIVE" DATA FOR SOST
(N = 93 CLASSES)

Item    Mean

  9     [illegible]
 10     2.1?
 11     3.79
 12     1.93
 13     1.96
 14     1.49
 15     1.7?
 16     3.69
 17     1.86
 18     2.27
 19     3.32
 20     2.67
 21     2.33
 22     2.43
 23     2.44
 24     3.52
 25     3.46
 26     3.53
 27     2.82
 28     2.28

(? marks a digit that is illegible in the source.)

[The "Standard Deviation of Means" and "Average Standard Deviation" columns
cannot be aligned with their items in the scanned source. The legible
entries in the former run from about .34 to .53 and in the latter from
about .70 to 1.19.]



Mean - this column contains the means of
class report means. Means from combined
reports and also those from individual
class reports may be compared to these
means.

Standard Deviation of the Means - this
column contains the standard deviations
of the class report means. It is appropriate
to compare the standard deviations
from combined reports with these figures.

Average Standard Deviation - this
column contains the average standard
deviations of the class report means.
It is appropriate to compare the standard
deviations in an individual instructor's
class report with these figures.
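
The three columns described above can be sketched as follows for a single item. This is only an illustration under stated assumptions: the function name `normative_summary`, the use of the sample (n - 1) standard deviation, the reading of "Average Standard Deviation" as the mean of the within-class standard deviations, and the three small classes of ratings are all hypothetical and are not taken from the report.

```python
from statistics import mean, stdev


def normative_summary(class_ratings):
    """Summarise one item across classes.

    class_ratings: list of lists, one list of student ratings per class.
    Each class needs at least two ratings, and at least two classes are
    needed, because the sample standard deviation is used (an assumption).
    """
    class_means = [mean(ratings) for ratings in class_ratings]
    class_sds = [stdev(ratings) for ratings in class_ratings]
    return {
        "mean": mean(class_means),          # mean of the class report means
        "sd_of_means": stdev(class_means),  # spread of the class means
        "average_sd": mean(class_sds),      # typical within-class spread
    }


# Hypothetical ratings on one item from three small classes
classes = [
    [1, 2, 2, 3],
    [2, 3, 3, 4],
    [1, 1, 2, 2],
]

norms = normative_summary(classes)
print({name: round(value, 2) for name, value in norms.items()})
```

An individual class mean would then be read against `mean` and `sd_of_means`, and the standard deviation within an individual class report against `average_sd`.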






APPENDIX D

Pages 85-119
REMOVED AT THE AUTHOR'S REQUEST






